A comprehensive guide to the BFF architectural pattern — what it is, why it exists, and how companies like Netflix use it to power multi-device experiences at scale.
- What is the BFF Pattern?
- The Problem BFF Solves
- Core Principles
- BFF Architecture Deep Dive
- BFF vs API Gateway
- Real-World Example: Netflix
- When to Use BFF
- Tradeoffs & Pitfalls
- BFF with GraphQL
- Summary
Backend for Frontend (BFF) is an architectural pattern where you create a dedicated backend service for each distinct frontend client — web, mobile, TV, etc. Each BFF acts as a tailored API layer that aggregates, transforms, and shapes data from downstream microservices into exactly what its paired frontend needs.
The pattern originated at SoundCloud and was described in detail by Sam Newman (of Building Microservices fame); it has since been widely adopted by companies operating at scale with multiple client types.
┌──────────────┐     ┌─────────────┐
│   Web App    │◄────┤  BFF (Web)  ├──┐
└──────────────┘     └─────────────┘  │
                                      │  ┌────────────────────────────┐
┌──────────────┐     ┌─────────────┐  │  │  Downstream Microservices  │
│   iOS App    │◄────┤ BFF (Mobile)├──┼─►│  - Auth Service            │
└──────────────┘     └─────────────┘  │  │  - User Service            │
                                      │  │  - Content Service         │
┌──────────────┐     ┌─────────────┐  │  │  - Recommendation Engine   │
│   Smart TV   │◄────┤  BFF (TV)   ├──┘  │  - Billing Service         │
└──────────────┘     └─────────────┘     └────────────────────────────┘
Each BFF is owned by the same team that builds the frontend. This is a crucial detail — it eliminates the coordination overhead of asking a shared API team to accommodate every client's evolving needs.
Without BFF, teams typically build a single, general-purpose API that tries to serve all clients. This creates immediate tension:
Mobile clients want:
- Minimal payloads (bandwidth is precious)
- Fewer round trips (latency compounds on mobile networks)
- Offline-friendly response shapes
Web clients want:
- Rich, detailed data (fast broadband, large screens)
- More flexibility in filtering and sorting
- Real-time updates via WebSockets
Smart TV clients want:
- Pre-aggregated content rows (limited compute on device)
- Simplified auth flows
- Optimized for 10-foot UI navigation patterns
A single API serving all three will inevitably be a compromise — over-fetching for mobile, under-fetching for TV, and constantly negotiating breaking changes between teams.
// Generic API response — a User object with everything
{
  "id": "u_abc123",
  "email": "jane@example.com",
  "firstName": "Jane",
  "lastName": "Doe",
  "billingAddress": { ... },           // Mobile doesn't need this
  "paymentMethods": [ ... ],           // TV doesn't need this
  "notificationPreferences": { ... },  // Web dashboard needs this
  "watchHistory": [ ... ],             // 200 items — mobile only wants 5
  "devices": [ ... ],
  "subscriptionDetails": { ... }
}

Mobile pays the bandwidth cost for fields it ignores. This is the problem BFF eliminates.
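A BFF fixes this by shaping the payload per client. As a minimal sketch (the types and the `toMobileUser` helper are hypothetical, with field names echoing the generic object above), the mobile BFF would return only what the phone renders:

```typescript
// Hypothetical mobile shaping step: keep only what the phone renders.
// FullUser mirrors the generic API object above; MobileUser is what
// the mobile BFF actually sends over the wire.
interface FullUser {
  id: string;
  email: string;
  firstName: string;
  lastName: string;
  watchHistory: { contentId: string; watchedAt: string }[]; // 200 items
  // ...billingAddress, paymentMethods, and many more unused fields
}

interface MobileUser {
  id: string;
  displayName: string;
  recentHistory: { contentId: string }[]; // capped at 5 for mobile
}

function toMobileUser(user: FullUser): MobileUser {
  return {
    id: user.id,
    displayName: `${user.firstName} ${user.lastName}`,
    // The phone only shows the five most recent titles
    recentHistory: user.watchHistory.slice(0, 5).map(h => ({ contentId: h.contentId })),
  };
}
```

The 200-item watch history and the billing fields never leave the BFF; the phone downloads exactly three fields.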
Each distinct client surface gets its own BFF. The boundary is typically drawn around interaction model — if two clients have fundamentally different UX paradigms, they get separate BFFs.
✅ Good BFF boundaries:
- Web BFF (browser, keyboard/mouse)
- Mobile BFF (iOS + Android, touch, push notifications)
- TV BFF (Smart TV, remote, lean-back experience)
❌ Too granular (usually):
- iOS BFF + Android BFF (same interaction model)
- Chrome BFF + Firefox BFF (same interaction model)
The BFF lives in the same repo (or a closely related one) as the frontend it serves, and is deployed, monitored, and iterated on by the same team. This eliminates the "API team as gatekeeper" bottleneck.
If another team wants your BFF's data, they talk to the downstream services directly or build their own BFF. Sharing a BFF recreates the general-purpose API problem.
A BFF should do aggregation, transformation, and protocol translation — not business logic. Business logic lives in the downstream services. A BFF that accumulates business rules becomes a maintenance liability.
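One way to keep that boundary concrete (a sketch; `formatPrice` is illustrative, not from any real codebase): a BFF may localize a value for display, but deciding whether a discount applies is a business rule that belongs downstream.

```typescript
// Fine in a BFF: presentation-only transformation (locale formatting).
function formatPrice(cents: number, locale: string): string {
  return new Intl.NumberFormat(locale, { style: 'currency', currency: 'USD' }).format(cents / 100);
}

// NOT fine in a BFF: a pricing/eligibility rule. This belongs in a
// billing or promotions service, where every client gets the same answer.
// function discountedPrice(user, plan) { /* business rule; keep downstream */ }
```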
Client Request
      │
      ▼
┌─────────────────────────────────────────┐
│                   BFF                   │
│                                         │
│  1. Authenticate / authorize request    │
│  2. Fan out to N downstream services    │ ──► Service A
│  3. Await responses (parallel)          │ ──► Service B
│  4. Aggregate + merge data              │ ──► Service C
│  5. Transform to client-optimal shape   │
│  6. Apply client-specific logic         │
│     (feature flags, A/B, locale)        │
└─────────────────────────────────────────┘
      │
      ▼
Client Response (right-sized for this client)
| Responsibility | Description |
|---|---|
| Request aggregation | Fan out to multiple services, merge results |
| Response shaping | Return only fields the client needs |
| Protocol translation | e.g., gRPC downstream → REST/JSON upstream |
| Auth token handling | Exchange client tokens for service-to-service tokens |
| Caching | Client-appropriate caching strategy |
| Error normalization | Translate downstream errors into client-friendly messages |
| Feature flagging | Client-specific feature rollout logic |
| Locale / i18n | Format dates, currency, strings per locale |
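The error-normalization responsibility from the table can be sketched as follows. The downstream and client error shapes here are hypothetical; the point is that internal detail never reaches the client.

```typescript
// Hypothetical downstream error shape (illustrative, not a real API).
interface DownstreamError {
  service: string;
  code: string;     // e.g. "CATALOG_TIMEOUT", "USER_NOT_FOUND"
  detail?: string;  // internal diagnostics; never leaked to clients
}

// What the client actually receives: safe message plus a retry hint.
interface ClientError {
  message: string;
  retryable: boolean;
}

function normalizeError(err: DownstreamError): ClientError {
  switch (err.code) {
    case 'CATALOG_TIMEOUT':
      return { message: 'Content is temporarily unavailable.', retryable: true };
    case 'USER_NOT_FOUND':
      return { message: 'Please sign in again.', retryable: false };
    default:
      // Unknown internal failures collapse to a generic, retryable message
      return { message: 'Something went wrong.', retryable: true };
  }
}
```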
These two are frequently confused. They solve different problems and are often used together.
| Aspect | API Gateway | BFF |
|---|---|---|
| Purpose | Cross-cutting concerns (auth, rate limiting, routing) | Client-specific data aggregation & shaping |
| Owned by | Platform/infra team | Frontend team |
| Client-specific? | No — handles all clients | Yes — one per client type |
| Business logic? | No | Minimal |
| Typical tech | Kong, AWS API Gateway, Nginx | Node.js, Go, custom service |
| Number of instances | One (or one per environment) | One per client type |
In practice: Requests flow Client → API Gateway → BFF → Microservices. The gateway handles SSL termination, rate limiting, and auth token verification. The BFF handles the application-level aggregation.
Netflix serves content on thousands of device types: browsers, iOS, Android, Smart TVs (Samsung, LG, Sony), game consoles (PlayStation, Xbox), streaming sticks (Roku, Fire TV), and more. Each device has different:
- Screen resolutions and image aspect ratio requirements
- Memory and compute constraints
- UI paradigms (touch vs remote vs mouse)
- Network conditions and bandwidth budgets
- Feature sets (some devices can't play Dolby Atmos, etc.)
A single API serving all these devices would be unworkable.
Netflix's Device Experience (DX) team pioneered an approach where each device category gets a purpose-built API layer (their internal term is "Experience API" or "Device API", functionally equivalent to BFF).
                ┌───────────────────────────────┐
                │          API Gateway          │
                │  (Auth, Rate Limiting, TLS)   │
                └───────────────┬───────────────┘
                                │
           ┌────────────────────┼────────────────────┐
           │                    │                    │
           ▼                    ▼                    ▼
  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐
  │     Web BFF     │  │   Mobile BFF    │  │  TV/Device BFF  │
  │   (React app)   │  │  (iOS/Android)  │  │   (Smart TV,    │
  │                 │  │                 │  │    consoles)    │
  │ - Full metadata │  │ - Compressed    │  │ - Pre-aggregated│
  │ - Social share  │  │   thumbnails    │  │   content rows  │
  │ - Downloads UI  │  │ - Push notif.   │  │ - DRM config    │
  │ - Account mgmt  │  │ - Offline mode  │  │ - Device caps   │
  └────────┬────────┘  └────────┬────────┘  └────────┬────────┘
           │                    │                    │
           └────────────────────┼────────────────────┘
                                │ (internal network)
          ┌─────────────────────┼─────────────────────┐
          │                     │                     │
          ▼                     ▼                     ▼
 ┌──────────────────┐ ┌──────────────────┐ ┌──────────────────────┐
 │ Catalog Service  │ │ User Service     │ │ Recommendation Engine│
 ├──────────────────┤ ├──────────────────┤ ├──────────────────────┤
 │ Playback Service │ │ Billing Service  │ │ Search Service       │
 └──────────────────┘ └──────────────────┘ └──────────────────────┘
Let's build a simplified version of Netflix's "Homepage" API — the call that loads a user's personalized home screen rows (Continue Watching, Trending, My List, etc.).
These exist independently and serve all clients:
// user-service: GET /users/:id
// Returns full user object
{
  id: "u_abc123",
  profile: { name: "Jane", avatarUrl: "...", language: "en-US" },
  subscription: { tier: "standard", expiresAt: "2026-01-01" },
  preferences: { autoplay: true, maturityRating: "TV-14" }
}

// recommendations-service: GET /recommendations/:userId
// Returns ranked list of content IDs
{ contentIds: ["m_123", "m_456", "m_789", ...] } // up to 500 items

// catalog-service: POST /catalog/batch
// Body: { ids: string[] }
// Returns full content metadata
[
  {
    id: "m_123",
    title: "Stranger Things",
    type: "series",
    seasonCount: 4,
    episodeCount: 34,
    rating: "TV-14",
    images: {
      hero: "https://cdn.nflx.com/hero/st.jpg",          // 1920x1080
      thumbnail16x9: "https://cdn.nflx.com/16x9/st.jpg", // 480x270
      thumbnail2x3: "https://cdn.nflx.com/2x3/st.jpg",   // 200x300
      logo: "https://cdn.nflx.com/logo/st.png"
    },
    genres: ["Drama", "Sci-Fi", "Horror"],
    synopsis: "...", // 500+ chars
    cast: [ /* 50 members */ ],
    // ... 40+ more fields
  }
]

// continue-watching-service: GET /continue-watching/:userId
{ items: [{ contentId: "m_789", progressSeconds: 1234, totalSeconds: 3600, currentSeason: 3, currentEpisode: 4 }] }

// my-list-service: GET /my-list/:userId
{ contentIds: ["m_222", "m_333"] }

// mobile-bff/src/routes/homepage.ts
import { FastifyRequest, FastifyReply } from 'fastify';
import { parallelFetch } from '../utils/parallelFetch';
import { buildContentRow } from '../transformers/contentRow';

interface HomepageRequest {
  Params: { userId: string };
  Headers: { 'x-device-model': string; 'x-os-version': string };
}

export async function mobileHomepage(
  request: FastifyRequest<HomepageRequest>,
  reply: FastifyReply
) {
  const { userId } = request.params;
  const deviceModel = request.headers['x-device-model'];

  // Step 1: Fan out to all required services IN PARALLEL
  // Mobile doesn't need billing, account details, etc.
  const [user, recommendations, continueWatching, myList] = await parallelFetch([
    fetchUser(userId),
    fetchRecommendations(userId),
    fetchContinueWatching(userId),
    fetchMyList(userId),
  ]);

  // Step 2: Determine which content IDs we need metadata for
  // Mobile home screen shows max 20 items per row, 4 rows = 80 items max
  const recommendedIds = recommendations.contentIds.slice(0, 20);
  const myListIds = myList.contentIds.slice(0, 20);
  const continueIds = continueWatching.items.map(i => i.contentId);
  const allIds = [...new Set([...continueIds, ...recommendedIds, ...myListIds])];

  // Step 3: Batch fetch catalog data (one request, not N requests)
  const catalog = await fetchCatalogBatch(allIds);
  const catalogMap = new Map(catalog.map(c => [c.id, c]));

  // Step 4: Shape the response for mobile
  // - Use 2x3 portrait thumbnails (better for mobile grid)
  // - Truncate synopsis to 120 chars
  // - Omit cast, full genre list, etc.
  // - Include progress data inline for Continue Watching
  const rows = [
    continueWatching.items.length > 0 && {
      id: 'continue-watching',
      title: 'Continue Watching',
      layout: 'landscape', // 16x9 for continue watching
      items: continueWatching.items.map(item => {
        const content = catalogMap.get(item.contentId);
        if (!content) return null;
        return {
          contentId: content.id,
          title: content.title,
          thumbnail: content.images.thumbnail16x9, // landscape for progress bar
          progressPercent: Math.round((item.progressSeconds / item.totalSeconds) * 100),
          progressSeconds: item.progressSeconds,
          // Mobile-specific: show episode info if series
          subtitle: content.type === 'series'
            ? `S${item.currentSeason} E${item.currentEpisode}`
            : formatDuration(item.totalSeconds - item.progressSeconds) + ' left',
        };
      }).filter(Boolean),
    },
    {
      id: 'my-list',
      title: 'My List',
      layout: 'portrait', // 2x3 portrait grid
      items: myListIds.map(id => {
        const content = catalogMap.get(id);
        if (!content) return null;
        return {
          contentId: content.id,
          title: content.title,
          thumbnail: content.images.thumbnail2x3, // portrait for grid
          // Mobile omits: synopsis, cast, full genres
        };
      }).filter(Boolean),
    },
    {
      id: 'recommended',
      title: `Top Picks for ${user.profile.name}`,
      layout: 'portrait',
      items: recommendedIds.map(id => {
        const content = catalogMap.get(id);
        if (!content) return null;
        return {
          contentId: content.id,
          title: content.title,
          thumbnail: content.images.thumbnail2x3,
          rating: content.rating,
          // Truncated synopsis for mobile — 120 chars max
          synopsis: content.synopsis.length > 120
            ? content.synopsis.slice(0, 117) + '...'
            : content.synopsis,
          isNew: isNewRelease(content.releaseDate),
        };
      }).filter(Boolean),
    },
  ].filter(Boolean);

  // Step 5: Add mobile-specific metadata
  return reply.send({
    profile: {
      name: user.profile.name,
      avatarUrl: user.profile.avatarUrl,
      // Mobile needs push notification token refresh signal
      shouldRefreshPushToken: shouldRefreshToken(user),
    },
    rows,
    // Mobile-specific: used to configure offline download UI
    downloadEnabled: user.subscription.tier !== 'basic',
    // Cache hint: tell the mobile app how long to cache this response
    meta: {
      cacheMaxAgeSeconds: 60,
      requestId: request.id,
    },
  });
}

// tv-bff/src/routes/homepage.ts
export async function tvHomepage(
  request: FastifyRequest<HomepageRequest>,
  reply: FastifyReply
) {
  const { userId } = request.params;
  const deviceCapabilities = parseDeviceCapabilities(request.headers);

  const [user, recommendations, continueWatching, myList] = await parallelFetch([
    fetchUser(userId),
    fetchRecommendations(userId),
    fetchContinueWatching(userId),
    fetchMyList(userId),
  ]);

  // TV shows more items per row — 40 instead of 20
  // TV also pre-fetches hero metadata for first item (no hover, remote navigation)
  const recommendedIds = recommendations.contentIds.slice(0, 40);
  const allIds = [...new Set([
    ...continueWatching.items.map(i => i.contentId),
    ...recommendedIds,
    ...myList.contentIds.slice(0, 40),
  ])];

  const catalog = await fetchCatalogBatch(allIds);
  const catalogMap = new Map(catalog.map(c => [c.id, c]));

  const rows = [
    // TV: First item gets hero treatment (no hover state on remote)
    {
      id: 'hero',
      type: 'hero-banner',
      // Full hero image, full synopsis, logo overlay
      item: buildHeroItem(catalogMap.get(recommendations.contentIds[0]), deviceCapabilities),
    },
    continueWatching.items.length > 0 && {
      id: 'continue-watching',
      title: 'Continue Watching',
      layout: 'landscape',
      items: continueWatching.items.map(item => {
        const content = catalogMap.get(item.contentId);
        if (!content) return null; // skip tiles whose catalog entry is missing
        return {
          contentId: content.id,
          title: content.title,
          // TV: always use large 16x9 thumbnails
          thumbnail: content.images.thumbnail16x9,
          progressPercent: Math.round((item.progressSeconds / item.totalSeconds) * 100),
          // TV: pre-render the DRM license URL to reduce playback start time
          drmLicenseUrl: buildDrmUrl(content.id, deviceCapabilities.drmSystem),
          // TV: include Dolby Atmos / 4K badges based on device caps
          badges: buildBadges(content, deviceCapabilities),
        };
      }).filter(Boolean),
    },
    {
      id: 'trending',
      title: 'Trending Now',
      layout: 'landscape',
      // TV shows 40 items (user scrolls with remote, loading more is janky)
      items: recommendedIds.map(id => {
        const content = catalogMap.get(id);
        if (!content) return null;
        return {
          contentId: content.id,
          title: content.title,
          thumbnail: content.images.thumbnail16x9,
          // TV: include logo overlay for branded look
          logoUrl: content.images.logo,
          // TV: pre-bake the maturity rating badge
          maturityBadge: content.rating,
          badges: buildBadges(content, deviceCapabilities),
        };
      }).filter(Boolean),
    },
  ].filter(Boolean);

  return reply.send({
    profile: {
      name: user.profile.name,
      avatarUrl: user.profile.avatarUrl,
    },
    rows,
    // TV-specific: ambient mode background when idle
    ambientModeEnabled: deviceCapabilities.supportsAmbientMode,
    // TV-specific: DRM system config upfront (avoids round trip on play)
    drmSystem: deviceCapabilities.drmSystem,
    meta: {
      // TV can cache longer — user browses more slowly
      cacheMaxAgeSeconds: 120,
    },
  });
}

// shared/utils/parallelFetch.ts
/**
 * Executes multiple service calls in parallel.
 * Fails fast if any call fails (adjust with Promise.allSettled for resilience).
 */
export async function parallelFetch<T extends readonly unknown[]>(
  promises: { [K in keyof T]: Promise<T[K]> }
): Promise<T> {
  return Promise.all(promises) as Promise<T>;
}

/**
 * Resilient variant — partial failures return the fallback instead of throwing.
 * Use when downstream services are optional (e.g., recommendations degrading gracefully).
 */
export async function parallelFetchResilient<T>(
  calls: Array<{ key: string; fn: () => Promise<T>; fallback: T }>
): Promise<Record<string, T>> {
  const results = await Promise.allSettled(calls.map(c => c.fn()));
  return Object.fromEntries(
    calls.map((c, i) => {
      const result = results[i];
      return [c.key, result.status === 'fulfilled' ? result.value : c.fallback];
    })
  );
}

The same underlying data, shaped for each client:
Mobile Response (homepage):

{
  profile: {
    name, avatarUrl,
    shouldRefreshPushToken: true,
  },
  rows: [
    {
      id: 'continue-watching',
      layout: 'landscape',
      items: [
        {
          thumbnail16x9,
          progressPercent,
          subtitle: 'S3 E4',
        }
      ]
    },
    {
      id: 'recommended',
      layout: 'portrait',
      items: [
        {
          thumbnail2x3,
          synopsis: '120 chars...',
          isNew: true,
        }
      ]
    },
  ],
  downloadEnabled: true,
  meta: { cacheMaxAgeSeconds: 60 }
}

TV Response (homepage):

{
  profile: { name, avatarUrl },
  rows: [
    {
      id: 'hero', type: 'hero-banner',
      item: {
        fullSynopsis, logoUrl,
        heroImage (1920x1080),
        drmLicenseUrl
      }
    },
    {
      id: 'continue-watching',
      items: [{
        thumbnail16x9,
        progressPercent,
        drmLicenseUrl,
        badges: ['4K', 'Atmos']
      }]
    },
    {
      id: 'trending',
      items: [{
        thumbnail16x9,
        logoUrl,
        maturityBadge
      }]
    },
  ],
  ambientModeEnabled: true,
  drmSystem: 'widevine',
  meta: { cacheMaxAgeSeconds: 120 }
}
Mobile gets portrait thumbnails, push token signals, and a 60-second cache. TV gets landscape hero images, pre-baked DRM URLs, Dolby/4K badges, and a 120-second cache. Both come from the same downstream services — shaped by their respective BFFs.
- You have multiple distinct client types with genuinely different data needs
- Frontend teams own their delivery end-to-end and want control over the API contract
- You need to aggregate multiple microservices into a single client call
- Different clients need different caching strategies or auth flows
- You want to iterate the API without coordinating across a shared API team
- You have a single client (just build a regular API)
- All your clients are functionally identical (same interaction model, same data needs)
- Your team is small — BFF adds operational overhead (more services to deploy and monitor)
- Your backend is a simple CRUD app with no meaningful aggregation needs
- Frontend velocity: Teams ship independently without API team bottlenecks
- Right-sized payloads: No over-fetching or under-fetching
- Resilience isolation: A TV BFF outage doesn't affect mobile
- Security: Sensitive fields never leave the BFF if the client doesn't need them
- Client-optimized caching: Each client gets its own cache TTLs
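The last point can be as simple as each BFF emitting its own cache headers. A sketch (the helper is illustrative; the TTLs mirror the 60s mobile / 120s TV hints used in the handlers above):

```typescript
// Per-client cache policy: same logical resource, different TTLs.
const CACHE_TTL_SECONDS = {
  mobile: 60,  // phones refresh often; keep the window short
  tv: 120,     // lean-back browsing is slower; cache longer
} as const;

type ClientType = keyof typeof CACHE_TTL_SECONDS;

function cacheControlHeader(client: ClientType): string {
  // 'private' because homepage responses are personalized per user
  return `private, max-age=${CACHE_TTL_SECONDS[client]}`;
}
```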
Code duplication. Multiple BFFs often duplicate logic (auth, error handling, service clients). Mitigate with shared internal libraries for common concerns — but resist the urge to share so much that you've effectively built a monolith again.
BFF sprawl. Without discipline, BFFs accumulate business logic. If your BFF is making decisions about pricing, eligibility, or content licensing, those rules belong in a downstream service.
Testing overhead. Each BFF is a service that requires its own integration tests, load tests, and monitoring.
Latency amplification. BFFs add a network hop. Keep them in the same datacenter/region as downstream services, and always fan out requests in parallel, never sequentially.
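The parallel-vs-sequential point is worth seeing in numbers. A self-contained sketch with three stubbed 50 ms downstream calls: awaiting them one by one pays roughly the sum of the latencies, while `Promise.all` pays roughly the maximum.

```typescript
// Stub downstream calls: each takes ~50 ms.
const delay = (ms: number) => new Promise<void>(res => setTimeout(res, ms));
async function fetchA() { await delay(50); return 'a'; }
async function fetchB() { await delay(50); return 'b'; }
async function fetchC() { await delay(50); return 'c'; }

// Sequential fan-out: total latency is the SUM (~150 ms).
async function sequentialMs(): Promise<number> {
  const start = Date.now();
  await fetchA();
  await fetchB();
  await fetchC();
  return Date.now() - start;
}

// Parallel fan-out: total latency is the MAX (~50 ms).
async function parallelMs(): Promise<number> {
  const start = Date.now();
  await Promise.all([fetchA(), fetchB(), fetchC()]);
  return Date.now() - start;
}
```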
GraphQL is a natural fit for BFF — you get the aggregation and shaping benefits with a typed query language. Instead of REST endpoints, each BFF exposes a GraphQL schema tuned for its client.
# mobile-bff/src/schema.graphql
type Query {
  homepage(userId: ID!): HomepageResult!
}

type HomepageResult {
  profile: MobileProfile!
  rows: [ContentRow!]!
  downloadEnabled: Boolean!
}

type MobileProfile {
  name: String!
  avatarUrl: String!
  shouldRefreshPushToken: Boolean!
}

type ContentRow {
  id: ID!
  title: String!
  layout: RowLayout!
  items: [ContentItem!]!
}

type ContentItem {
  contentId: ID!
  title: String!
  thumbnail: String!    # Already the right size for mobile
  progressPercent: Int  # Only present for Continue Watching
  synopsis: String      # Already truncated to 120 chars
  isNew: Boolean
}

enum RowLayout { LANDSCAPE PORTRAIT }

The mobile GraphQL BFF enforces the mobile contract at the schema level — clients literally cannot request a 1920x1080 hero image because the field doesn't exist in the mobile schema.
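The resolver behind that schema does the same fan-out and shaping as the REST handler. A minimal sketch (all names are illustrative; the downstream call is stubbed here so the shaping logic stands alone, and a real server would wire `resolvers` to the SDL via graphql-js or Apollo):

```typescript
// Resolver-map sketch for the mobile schema above.
interface MobileProfile { name: string; avatarUrl: string; shouldRefreshPushToken: boolean }
interface HomepageResult { profile: MobileProfile; rows: unknown[]; downloadEnabled: boolean }

// Stub for the user-service client (an assumption, not a real endpoint).
async function fetchUser(userId: string) {
  return {
    profile: { name: 'Jane', avatarUrl: 'https://example.com/avatar.png' },
    subscription: { tier: 'standard' },
  };
}

export const resolvers = {
  Query: {
    async homepage(_parent: unknown, args: { userId: string }): Promise<HomepageResult> {
      const user = await fetchUser(args.userId);
      return {
        profile: {
          name: user.profile.name,
          avatarUrl: user.profile.avatarUrl,
          shouldRefreshPushToken: false, // real logic would check token age
        },
        rows: [], // same row-building fan-out as the REST handler
        downloadEnabled: user.subscription.tier !== 'basic',
      };
    },
  },
};
```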
The BFF pattern solves the fundamental tension between general-purpose APIs and client-specific needs. By giving each client its own backend, teams can:
- Move fast without coordination overhead
- Serve optimal payloads for every device
- Isolate failures across client surfaces
- Give ownership of the full delivery stack to the team closest to the user
Netflix's approach — separate experience APIs for web, mobile, and TV — is the pattern at scale: thousands of device types served by a manageable set of client-type BFFs, all drawing from the same pool of downstream microservices.
The key discipline is keeping BFFs thin. They aggregate, transform, and translate — they do not make business decisions. Business logic belongs in the services downstream. A BFF that drifts into business logic becomes the very bottleneck it was designed to eliminate.
Further Reading:
- Sam Newman — Pattern: Backends For Frontends
- Netflix Tech Blog — Engineering Around the World with Device Experience APIs
- Building Microservices, 2nd Edition — Sam Newman (O'Reilly)