Frontend System Design: Scalability & Performance

A structured interview reference guide covering scalability, asset optimization, and performance — organized by topic with theory, code snippets, and polished answer templates.

1. What "Scalability" Means in Frontend

Common misconception: Scalability = servers handling more requests.

In frontend system design interviews, scalability means something broader:

Dimension	Question to Answer
User scale	Can the UI handle more concurrent users?
Data scale	Can it handle 100 items growing to 1M?
Feature scale	Can it handle 5 screens growing to 200?
Developer scale	Can 50 engineers work on it without chaos?
Device scale	Does it work on slow devices and bad networks?

Key insight for interviews:

Frontend scalability = "How does the app behave when complexity increases?" — not just traffic.

2. Scalability Dimensions

2.1 Performance Scalability (Data Growth)

The question: What if data grows from 100 items → 1,000,000 items?

Virtualization

Only render items currently visible in the viewport. Avoid creating 1M DOM nodes.

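// react-window v1 API (FixedSizeList); newer major versions may differ
import { FixedSizeList as List } from 'react-window';

<List
  height={500}
  width="100%"
  itemCount={1000000}
  itemSize={40}
>
  {({ index, style }) => (
    <div style={style}>Row {index}</div>
  )}
</List>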

Libraries: react-window, react-virtualized, @tanstack/virtual
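
To make the idea concrete, here is a minimal sketch of the windowing arithmetic such libraries perform for fixed-height rows. The helper name `getVisibleRange` and the `overscan` parameter are illustrative assumptions, not the API of any library listed above.

```typescript
// Hypothetical helper: which rows should be mounted for a fixed-height list?
interface VisibleRange {
  start: number; // index of the first row to render (inclusive)
  end: number;   // index of the last row to render (inclusive)
}

function getVisibleRange(
  scrollTop: number,
  viewportHeight: number,
  itemSize: number,
  itemCount: number,
  overscan = 2, // extra rows above and below to reduce flicker while scrolling
): VisibleRange {
  const first = Math.floor(scrollTop / itemSize);
  const last = Math.floor((scrollTop + viewportHeight - 1) / itemSize);
  return {
    start: Math.max(0, first - overscan),
    end: Math.min(itemCount - 1, last + overscan),
  };
}

// Each mounted row is positioned at index * itemSize inside a container
// whose total height is itemCount * itemSize.
const visible = getVisibleRange(4000, 500, 40, 1000000);
const mountedRows = visible.end - visible.start + 1; // 17 rows instead of 1,000,000
```

With a 500px viewport and 40px rows, only 17 rows are mounted regardless of how large `itemCount` grows.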

Pagination & Infinite Scroll

Load data in chunks. Never load everything at once.

Offset-based: /api/items?page=2&limit=50
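Cursor-based: /api/items?cursor=xyz&limit=50 (preferred for large datasets: rows inserted or deleted between requests don't cause skipped or duplicated items, and the server avoids scanning past large offsets)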
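
The difference is easiest to see with a small in-memory sketch. `getPage` below is a hypothetical stand-in for what a server might do behind the cursor endpoint, assuming items are sorted by a numeric `id` and the cursor is the last id the client saw.

```typescript
interface ListItem { id: number; title: string }
interface ItemPage { items: ListItem[]; nextCursor: number | null }

// Hypothetical server-side logic. Assumes `all` is sorted by id ascending.
function getPage(all: ListItem[], cursor: number | null, limit: number): ItemPage {
  // Resume strictly after the last id the client saw, rather than at a numeric offset.
  const remaining = cursor === null ? all : all.filter((it) => it.id > cursor);
  const items = remaining.slice(0, limit);
  const lastItem = items[items.length - 1];
  const hasMore = remaining.length > limit;
  return { items, nextCursor: hasMore && lastItem ? lastItem.id : null };
}

const rows: ListItem[] = [1, 2, 3, 4, 5].map((id) => ({ id, title: `Item ${id}` }));
const firstPage = getPage(rows, null, 2); // ids 1, 2

// A new row is inserted at the head between requests...
const rowsAfterInsert: ListItem[] = [{ id: 0, title: 'Item 0' }, ...rows];
// ...but the cursor still resumes after id 2, so nothing is repeated.
const secondPage = getPage(rowsAfterInsert, firstPage.nextCursor, 2); // ids 3, 4
```

With offset pagination, the same insert would shift every row by one and the second page would start with id 2 again.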

Memoization

Skip re-rendering a component when its props haven't changed (React.memo compares props shallowly).

const ExpensiveComponent = React.memo(({ data }) => {
  return <div>{data.title}</div>;
});

Debouncing & Throttling

Control frequency of expensive operations triggered by user events.

// Assumes lodash-style helpers, e.g. import { debounce, throttle } from 'lodash-es';
// Debounce: fires 300ms after the user stops typing
const debouncedSearch = debounce(fetchResults, 300);

// Throttle: fires at most once every 200ms during scroll
const throttledScroll = throttle(handleScroll, 200);
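
For interviews it helps to be able to write both helpers from scratch. The following is a minimal sketch, not a replacement for a hardened library version (lodash, for example, adds cancel/flush and leading/trailing options). The injectable `now` clock on `throttle` is an assumption added here so the behavior can be demonstrated without real timers.

```typescript
// Debounce: run fn only after calls have stopped for waitMs.
function debounce<A extends unknown[]>(fn: (...args: A) => void, waitMs: number) {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (...args: A): void => {
    if (timer !== undefined) clearTimeout(timer); // restart the countdown on every call
    timer = setTimeout(() => fn(...args), waitMs);
  };
}

// Throttle (leading edge only): run fn at most once per waitMs window.
function throttle<A extends unknown[]>(
  fn: (...args: A) => void,
  waitMs: number,
  now: () => number = Date.now, // injectable clock, an assumption for testability
) {
  let lastRun = -Infinity;
  return (...args: A): void => {
    const t = now();
    if (t - lastRun >= waitMs) {
      lastRun = t;
      fn(...args);
    } // calls inside the window are dropped
  };
}

// Usage with a fake clock
let fakeTime = 0;
const scrollPositions: number[] = [];
const recordScroll = throttle((y: number) => { scrollPositions.push(y); }, 200, () => fakeTime);
recordScroll(10);                 // runs (first call)
fakeTime = 100; recordScroll(20); // dropped, only 100ms elapsed
fakeTime = 250; recordScroll(30); // runs, 250ms since the last run
```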

2.2 Codebase Scalability (Feature Growth)

The question: What if the app grows from 5 screens → 200 screens?

Modular / Feature-Based Architecture

Each feature is self-contained — its own components, hooks, services, and tests.

/features
  /auth
    components/
    hooks/
    services/
    index.ts
  /dashboard
  /analytics
  /settings

Code Splitting

Only load what the user needs for the current route.

import { lazy, Suspense } from 'react';

// lazy() expects the imported module to have a default export
const Dashboard = lazy(() => import('./features/dashboard'));

function App() {
  return (
    <Suspense fallback={<Spinner />}>
      <Dashboard />
    </Suspense>
  );
}

Micro-frontends (Large Orgs)

Independently deployable frontend pieces, each owned by a separate team. Useful at org scale (think: Spotify, Amazon). Not necessary for most apps — mention it as a valid extreme.

2.3 State Management Scalability

The question: What if state becomes deeply nested and complex?

Separation of Concerns

State Type	Tool
Server/async state	React Query / TanStack Query
Global UI state	Zustand, Redux Toolkit
Local component state	`useState`, `useReducer`

Avoid a single giant global store for everything.

Data Normalization

For large collections, normalize by ID to prevent duplication and enable O(1) lookups.

// Normalized structure (like a database)
const state = {
  users: {
    byId: {
      'u1': { id: 'u1', name: 'Alice' },
      'u2': { id: 'u2', name: 'Bob' },
    },
    allIds: ['u1', 'u2']
  }
};

// Access
const user = state.users.byId['u1'];
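
API responses usually arrive as arrays, so some conversion step is needed. The sketch below shows one way to build and update the `byId` / `allIds` shape. The function names `normalizeUsers` and `renameUser` are illustrative, not taken from any library.

```typescript
interface User { id: string; name: string }
interface NormalizedUsers { byId: Record<string, User>; allIds: string[] }

function normalizeUsers(list: User[]): NormalizedUsers {
  const byId: Record<string, User> = {};
  const allIds: string[] = [];
  for (const user of list) {
    if (byId[user.id] === undefined) allIds.push(user.id); // keep first-seen order, no duplicate ids
    byId[user.id] = user; // a later copy of the same entity overwrites the earlier one
  }
  return { byId, allIds };
}

// Updating one entity touches one key; every list that references the id sees the change.
function renameUser(state: NormalizedUsers, id: string, name: string): NormalizedUsers {
  const existing = state.byId[id];
  if (!existing) return state;
  return { ...state, byId: { ...state.byId, [id]: { ...existing, name } } };
}

const normalized = normalizeUsers([
  { id: 'u1', name: 'Alice' },
  { id: 'u2', name: 'Bob' },
  { id: 'u1', name: 'Alice' }, // duplicate arriving from another response
]);
const renamed = renameUser(normalized, 'u2', 'Robert');
```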

Slice-Based Architecture (Redux Toolkit)

Split the store into domain-specific slices, avoiding one massive reducer.

2.4 Network Scalability

The question: What if 1M users open the app simultaneously?

Caching Layers

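A request falls through these layers in order; a hit at any layer skips everything below it.

React Query / TanStack Query Cache (in memory)
    ↓
Service Worker Cache
    ↓
Browser HTTP Cache (Cache-Control, ETag)
    ↓
CDN Edge Cache
    ↓
Origin Server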

Request Deduplication

Prevent multiple components from triggering the same API call simultaneously. React Query handles this out of the box.
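
If you are asked how deduplication works without a library, the core idea is to share one in-flight promise per key. This is a simplified sketch under that assumption. `dedupe` and `loadTasks` are hypothetical names, and this is not how React Query is implemented internally.

```typescript
// One shared promise per key while a request is in flight.
const inFlight = new Map<string, Promise<unknown>>();

function dedupe<T>(key: string, fetcher: () => Promise<T>): Promise<T> {
  const existing = inFlight.get(key);
  if (existing) return existing as Promise<T>;

  // Remove the entry once the request settles so later calls fetch fresh data.
  const promise = fetcher().then(
    (value) => { inFlight.delete(key); return value; },
    (error) => { inFlight.delete(key); throw error; },
  );
  inFlight.set(key, promise);
  return promise;
}

// Usage: two components ask for the same resource at the same time.
let networkCalls = 0;
const loadTasks = (): Promise<string[]> => {
  networkCalls += 1;
  return Promise.resolve(['write docs']); // stand-in for a real network call
};
const fromHeader = dedupe('tasks', loadTasks);
const fromSidebar = dedupe('tasks', loadTasks);
// Both callers receive the same promise; loadTasks ran once.
```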

Optimistic Updates

Update the UI immediately, then sync with the server. Improves perceived performance.

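// React Query mutation with optimistic update and rollback
useMutation({
  mutationFn: updateTask,
  onMutate: async (newTask) => {
    // Cancel outgoing refetches so they can't overwrite the optimistic value
    await queryClient.cancelQueries({ queryKey: ['tasks'] });
    // Snapshot the current cache for rollback
    const previous = queryClient.getQueryData(['tasks']);
    // Optimistically update the cache
    queryClient.setQueryData(['tasks'], (old = []) =>
      old.map((t) => (t.id === newTask.id ? newTask : t))
    );
    return { previous };
  },
  // Roll back if the server rejects the change
  onError: (_err, _newTask, context) => {
    queryClient.setQueryData(['tasks'], context?.previous);
  },
  // Refetch afterwards so the cache matches the server
  onSettled: () => queryClient.invalidateQueries({ queryKey: ['tasks'] }),
});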

2.5 Rendering Scalability

The question: What if re-renders increase as the app grows?

Problem	Solution
Prop drilling	Context or state manager
Parent re-renders cascading	`React.memo`, component splitting
Expensive calculations	`useMemo`
Unstable callback refs	`useCallback`
Large DOM trees	Component virtualization

Selective Subscriptions

With Zustand or Redux, subscribe only to the slice of state a component needs.

// Only re-renders when 'count' changes, not entire store
const count = useStore((state) => state.count);
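
To explain why the selector matters, it can help to sketch the mechanism. The tiny store below is an illustration only, not Zustand's or Redux's implementation: each subscriber passes a selector and is notified only when the selected value changes under `Object.is`.

```typescript
type Listener = () => void;

function createStore<S>(initial: S) {
  let state = initial;
  const listeners = new Set<Listener>();
  return {
    getState: () => state,
    setState(patch: Partial<S>) {
      state = { ...state, ...patch };
      listeners.forEach((l) => l());
    },
    subscribe<T>(selector: (s: S) => T, onChange: (value: T) => void) {
      let selected = selector(state);
      const listener = () => {
        const next = selector(state);
        if (!Object.is(next, selected)) { // skip when the selected slice is unchanged
          selected = next;
          onChange(next);
        }
      };
      listeners.add(listener);
      return () => { listeners.delete(listener); }; // unsubscribe
    },
  };
}

const store = createStore({ count: 0, theme: 'light' });
let countRenders = 0;
const unsubscribe = store.subscribe((s) => s.count, () => { countRenders += 1; });
store.setState({ theme: 'dark' }); // count unchanged: listener skipped
store.setState({ count: 1 });      // count changed: listener runs
unsubscribe();
store.setState({ count: 2 });      // no longer subscribed
```

In a React binding, `onChange` would trigger the component's re-render, so a component selecting `count` is untouched by `theme` updates.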

2.6 Team & Organizational Scalability

The question: What if 50 frontend developers work on the same codebase?

Practice	Purpose
Monorepo (Turborepo / Nx)	Shared code, coordinated releases
Shared component library	Consistency, no duplication
Design system (Storybook)	UI isolation and documentation
TypeScript	Type safety, refactoring confidence
ESLint + Prettier	Enforced code style
CI/CD pipelines	Automated testing and deployment
Feature flags	Safe progressive rollouts

3. Full Scalability Checklist

Use this as a mental checklist during any frontend system design interview.

☐ Data Scalability
    ☐ Pagination / cursor-based fetching
    ☐ List virtualization
    ☐ Lazy loading images
    ☐ Data normalization in store

☐ Rendering Scalability
    ☐ React.memo on expensive components
    ☐ useMemo / useCallback where needed
    ☐ Component splitting
    ☐ Selective store subscriptions

☐ Codebase Scalability
    ☐ Feature-based folder structure
    ☐ Code splitting (routes + heavy components)
    ☐ Reusable component library

☐ State Management Scalability
    ☐ Server state vs UI state separated
    ☐ Normalized collections
    ☐ No giant global store

☐ Network Scalability
    ☐ HTTP caching headers
    ☐ CDN for static assets
    ☐ Request deduplication
    ☐ Optimistic updates
    ☐ Retry & background sync strategies

☐ Bundle Size Scalability
    ☐ Code splitting
    ☐ Tree shaking
    ☐ Dynamic imports for heavy libs
    ☐ Bundle analysis (Webpack Bundle Analyzer)

☐ Performance Under Heavy Interaction
    ☐ Debounce/throttle events
    ☐ Web Workers for heavy computation
    ☐ requestIdleCallback for non-urgent tasks

☐ Offline & Resilience
    ☐ Service Workers
    ☐ IndexedDB persistence
    ☐ Autosave with debounce
    ☐ Conflict resolution strategy

☐ Security & Access Scalability
    ☐ Role-based rendering
    ☐ Feature flags
    ☐ Permission guards
    ☐ Token refresh handling

☐ Team Scalability
    ☐ Monorepo
    ☐ TypeScript
    ☐ Storybook
    ☐ CI/CD

4. Asset Optimization

4.1 JavaScript Optimization

Reduce Bundle Size

// Route-level code splitting
const Analytics = React.lazy(() => import('./Analytics'));

// Dynamic import for heavy lib only when needed
const { jsPDF } = await import('jspdf');

Enable tree shaking (use ES modules, avoid CommonJS for utility libs)
Replace heavy libraries: dayjs instead of moment, or tree-shakable date-fns instead of the full luxon bundle
Separate vendor chunks from app code for better long-term caching

Production Builds

Strip console.log statements
Dead code elimination via bundler (Vite, Webpack, Rollup)
Set NODE_ENV=production so libraries such as React drop their development-only checks and warnings

4.2 CSS Optimization

PurgeCSS / Tailwind content scanning (the purge option before Tailwind v3): remove unused CSS classes from the final bundle
CSS Modules — scoped styles, no global conflicts
Minify CSS in production builds
Avoid massive global stylesheets — prefer scoped, component-level styles

4.3 Image Optimization

Images are often the single largest contributor to page weight.

Choose the Right Format

Format	Use Case
WebP	Default choice for photos
AVIF	Even smaller than WebP, modern browsers
SVG	Icons, illustrations, logos
PNG	When transparency + lossless needed

Responsive Images

<img
  src="image-small.jpg"
  srcset="image-small.jpg 480w, image-large.jpg 1024w"
  sizes="(max-width: 600px) 480px, 1024px"
  alt="Description"
/>

Lazy Loading

<!-- Native browser lazy loading -->
<img src="image.jpg" loading="lazy" alt="..." />

For JS-controlled lazy loading, use IntersectionObserver.

Image CDN

Services like Cloudinary or Imgix can transform images on-the-fly (resize, format conversion, compression) without manual build steps.

4.4 Font Optimization

Web fonts are an easily overlooked cost: they can delay text rendering and cause layout shift when they swap in.

@font-face {
  font-family: 'MyFont';
  src: url('font.woff2') format('woff2'); /* Always prefer WOFF2 */
  font-display: swap; /* Show fallback text immediately, swap in the web font once loaded */
}

Load only the font weights you actually use (don't import 9 variants)
Use font-display: swap to prevent invisible text during load
Self-host fonts when possible to avoid third-party DNS lookups
Use <link rel="preload"> for critical fonts

4.5 Caching Strategy

Content Hashing

main.abc123.js   ← hash changes only when content changes
vendor.def456.js ← rarely changes, cached aggressively
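
Bundlers generate these names for you, but the principle is simple enough to sketch: derive the file name from the bytes of the file. In this illustration FNV-1a, a simple non-cryptographic hash, stands in for whatever hash a real bundler uses, and `hashedFileName` is a hypothetical helper.

```typescript
// FNV-1a (32-bit) over the content's UTF-16 code units; illustration only.
function fnv1a(content: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < content.length; i++) {
    hash ^= content.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // 32-bit multiply, kept unsigned
  }
  return hash.toString(16).padStart(8, '0');
}

function hashedFileName(baseName: string, ext: string, content: string): string {
  return `${baseName}.${fnv1a(content)}.${ext}`;
}

const v1 = hashedFileName('main', 'js', 'console.log("v1")');
const v1Again = hashedFileName('main', 'js', 'console.log("v1")');
const v2 = hashedFileName('main', 'js', 'console.log("v2")');
// v1 and v1Again are identical, so the cached copy stays valid;
// v2 gets a new name, so browsers fetch it as a brand-new URL.
```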

Cache-Control Headers

# For versioned/hashed static assets
Cache-Control: public, max-age=31536000, immutable

# For HTML (always revalidate)
Cache-Control: no-cache
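
On the server or CDN side this usually becomes a rule keyed on the file name. The sketch below assumes hashed assets can be recognized by a hex segment before the extension (as in main.abc123.js). The regular expression and the function name `cacheControlFor` are assumptions for illustration.

```typescript
// Hypothetical rule: names like "main.abc123.js" are content-hashed and never change.
const HASHED_ASSET = /\.[0-9a-f]{6,}\.(js|css|woff2|png|jpg|webp|avif|svg)$/;

function cacheControlFor(path: string): string {
  if (HASHED_ASSET.test(path)) {
    // One year; safe because the URL changes whenever the content does.
    return 'public, max-age=31536000, immutable';
  }
  // May be stored, but must be revalidated with the server before reuse.
  return 'no-cache';
}

const headerForBundle = cacheControlFor('/assets/main.abc123.js');
const headerForHtml = cacheControlFor('/index.html');
```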

4.6 CDN Usage

Serve all static assets (JS, CSS, images, fonts) via a CDN
Edge nodes cache assets geographically close to users
Reduces latency and origin server load
Examples: Cloudflare, AWS CloudFront, Fastly

4.7 Critical Rendering Path

Optimize what the browser needs to render the first visible frame.

<!-- Inline critical above-the-fold CSS directly in <head> -->
<style>
  /* Only styles needed for first viewport */
  .hero { ... }
</style>

<!-- Defer non-critical JS -->
<script src="analytics.js" defer></script>

<!-- Async for independent third-party scripts: each runs as soon as it downloads, in no guaranteed order -->
<script async src="ads.js"></script>

<!-- Preload critical resources -->
<link rel="preload" href="hero-font.woff2" as="font" type="font/woff2" crossorigin />

4.8 Third-Party Script Control

Third-party scripts (analytics, ads, chat widgets) are often the worst offenders.

Audit: Do we actually need this script?
Defer: Load after page is interactive
Idle load: Use requestIdleCallback for non-critical scripts

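// requestIdleCallback is not available in every browser, so fall back to a timer
if ('requestIdleCallback' in window) {
  requestIdleCallback(() => loadAnalytics());
} else {
  setTimeout(loadAnalytics, 2000);
}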

Sandbox heavy embeds via <iframe> with the sandbox attribute

5. Performance Checklist

5.1 Loading Performance (Initial Load)

Affects LCP (Largest Contentful Paint)

☐ Code split routes and heavy components
☐ Defer non-critical JS (defer / async)
☐ Use WebP/AVIF images
☐ Lazy load below-the-fold images
☐ Serve assets via CDN
☐ Enable Gzip / Brotli compression
☐ Preload critical fonts and assets
☐ Inline critical CSS

5.2 Rendering Performance (Runtime)

Affects smoothness and responsiveness.

// Memoize expensive components
const MemoCard = React.memo(Card);

// Memoize expensive calculations
const sorted = useMemo(() => sortItems(items), [items]);

// Stable callback references
const handleClick = useCallback(() => { ... }, [dep]);

☐ React.memo on components that receive the same props often
☐ useMemo for expensive derived data
☐ useCallback for callbacks passed as props
☐ Virtualize long lists (react-window)
☐ Avoid lifting state higher than needed
☐ Use store selectors to prevent broad re-subscriptions
☐ Keep DOM depth shallow

5.3 Interaction Performance

Affects INP (Interaction to Next Paint)

// Debounce search input
const debouncedSearch = debounce(fetchResults, 300);

// Throttle scroll handler
const throttledScroll = throttle(handleScroll, 100);

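// Heavy computation off the main thread
// (new URL(..., import.meta.url) lets bundlers such as Vite and webpack 5 resolve the worker file)
const worker = new Worker(new URL('./heavy-task.js', import.meta.url));
worker.postMessage({ data: largeDataset });
worker.onmessage = (e) => setResult(e.data);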

☐ Debounce text inputs / search
☐ Throttle scroll / resize / mousemove events
☐ Move CPU-heavy tasks to Web Workers
☐ Use requestIdleCallback for non-urgent work
☐ Batch DOM reads/writes to avoid layout thrashing
☐ Avoid frequent style recalculations in JS

5.4 Network Performance

☐ Paginate or cursor-paginate large lists
☐ Deduplicate in-flight requests (React Query)
☐ Cache with HTTP headers (ETag, Cache-Control)
☐ Use Service Worker for offline caching
☐ Implement optimistic UI updates
☐ Debounce API calls from search/filter inputs
☐ Add retry logic with exponential backoff
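
The last item is a common follow-up question. Below is a minimal sketch of exponential backoff with "full jitter". The defaults (500ms base, 8s cap, 4 attempts) are arbitrary assumptions, and `retry` is a hypothetical helper rather than any library's API.

```typescript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at maxMs.
function backoffDelay(attempt: number, baseMs = 500, maxMs = 8000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Full jitter: pick a random delay in [0, delayMs) so clients don't retry in lockstep.
function withJitter(delayMs: number, random: () => number = Math.random): number {
  return Math.floor(random() * delayMs);
}

async function retry<T>(task: () => Promise<T>, maxAttempts = 4): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await task();
    } catch (err) {
      lastError = err;
      if (attempt === maxAttempts - 1) break; // no wait after the final attempt
      const waitMs = withJitter(backoffDelay(attempt));
      await new Promise((resolve) => setTimeout(resolve, waitMs));
    }
  }
  throw lastError;
}

const schedule = [0, 1, 2, 3, 4, 5].map((n) => backoffDelay(n));
// 500, 1000, 2000, 4000, 8000, 8000 (capped)
```

In practice you would retry only idempotent requests and transient failures (network errors, 5xx), not 4xx responses.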

5.5 Memory Performance

Often overlooked — critical for long-lived SPAs.

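useEffect(() => {
  const controller = new AbortController();

  fetch('/api/data', { signal: controller.signal })
    .then((res) => res.json())
    .then(setData)
    .catch((err) => {
      // Aborting rejects the promise with an AbortError; that one is expected
      if (err.name !== 'AbortError') console.error(err);
    });

  return () => controller.abort(); // Cancel the request on unmount
}, []);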

useEffect(() => {
  const id = setInterval(tick, 1000);
  return () => clearInterval(id); // Always clean up timers
}, []);

☐ Cancel fetch requests on component unmount (AbortController)
☐ Clear intervals and timeouts
☐ Unsubscribe from event listeners and observables
☐ Disconnect IntersectionObservers / MutationObservers
☐ Profile memory in Chrome DevTools Memory tab

5.6 Animations & Smoothness

/* ✅ Use GPU-composited properties */
.card {
  transform: translateY(0);
  opacity: 1;
  transition: transform 0.3s ease, opacity 0.3s ease;
}

/* ❌ Avoid layout-triggering properties in animation */
/* width, height, top, left, margin → cause layout recalc */

☐ Animate only transform and opacity (GPU-composited)
☐ Avoid animating width/height/top/left
☐ Use will-change sparingly and intentionally
☐ Prefer CSS animations over JS for simple transitions
☐ Use requestAnimationFrame for JS-driven animations

5.7 Core Web Vitals

Always mention these explicitly — it signals production awareness.

Metric	Measures	Target ("good", at the 75th percentile of page loads)
LCP (Largest Contentful Paint)	Load speed of main content	≤ 2.5s
CLS (Cumulative Layout Shift)	Visual stability	≤ 0.1
INP (Interaction to Next Paint)	Responsiveness to input	≤ 200ms
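
Each metric also has a "poor" boundary (4s for LCP, 0.25 for CLS, 500ms for INP), with "needs improvement" in between. A small classifier, sketched here with hypothetical names, is handy when bucketing real-user measurements for a dashboard.

```typescript
type Metric = 'LCP' | 'CLS' | 'INP';
type Rating = 'good' | 'needs-improvement' | 'poor';

// [upper bound of "good", upper bound of "needs improvement"]
// LCP and INP in milliseconds, CLS unitless.
const THRESHOLDS: Record<Metric, [number, number]> = {
  LCP: [2500, 4000],
  CLS: [0.1, 0.25],
  INP: [200, 500],
};

function rate(metric: Metric, value: number): Rating {
  const [good, poor] = THRESHOLDS[metric];
  if (value <= good) return 'good';
  if (value <= poor) return 'needs-improvement';
  return 'poor';
}

const lcpRating = rate('LCP', 3000); // 'needs-improvement'
const inpRating = rate('INP', 600);  // 'poor'
```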

Monitoring tools:

Lighthouse — lab testing
Chrome DevTools Performance tab — profiling
Web Vitals library — real user monitoring
Sentry / Datadog RUM — production monitoring

6. Interview Answer Templates

"How would you make this scalable?"

"I'd think about scalability across multiple dimensions — not just traffic. For data scalability, I'd use virtualization and cursor-based pagination. For codebase scalability, modular feature-based architecture and code splitting. For state management, I'd separate server state from UI state and normalize large collections. For network scalability, layered caching and request deduplication. For team scalability, I'd introduce a shared design system, TypeScript, and CI/CD pipelines."

"How would you optimize frontend assets?"

"I'd focus on reducing bundle size through code splitting and tree shaking, optimize images by switching to WebP and adding lazy loading with responsive srcsets, minify CSS, implement content-hash-based caching with long max-age headers, serve everything through a CDN, defer non-critical scripts, and continuously monitor Core Web Vitals for regression."

"How would you ensure good performance?"

"I'd optimize at multiple layers — reduce bundle size for faster initial load, control re-renders with memoization and selective subscriptions to improve runtime performance, virtualize large datasets, cache API responses with React Query and HTTP headers, debounce heavy interactions, offload CPU-intensive work to Web Workers, and set up real-user monitoring on LCP, CLS, and INP."

7. Real-World Example: CodeSandbox-like App

When asked to design a browser-based code editor, apply this thinking:

Concern	Solution
Heavy editor library (Monaco)	Lazy load — don't bundle in initial chunk
Large file tree with 10k files	Virtualize with react-window
Compilation runs in browser	Web Worker — never block the main thread
Every keystroke would trigger a save	Debounce autosave (1–2s delay)
Project state on refresh	IndexedDB persistence (no Service Worker required; add one only for offline asset caching)
Multiple panels (editor, preview, console)	Split into isolated modules, load independently
Multiple users loading same npm packages	CDN + aggressive caching with content hashing
TypeScript checking in browser	Web Worker + incremental compilation

Sample answer structure:

"For a CodeSandbox-like app, I'd lazy load Monaco editor since it's several MB. The file tree would be virtualized. Compilation happens in a Web Worker to avoid freezing the UI. Autosave uses debounce to avoid hammering the server. The project state is persisted in IndexedDB so refreshes don't lose work. The editor, preview pane, and console are split into separate lazy-loaded modules so users only load what they need."

Use this document as a living reference. In interviews, pick the 2–3 most relevant dimensions for the specific system you're being asked to design — don't recite everything at once.