@marcobazzani
Last active May 29, 2025 11:44
Internal Build and Deployment System

Large-scale, multi-repository development is managed by a unified build and deployment framework that enforces modularity, consistency, and reproducibility.  All code is organized into discrete packages (each typically a separate repository or module) with a configuration file describing its build process, produced artifacts, and dependencies.  Each package has a stable interface version (e.g. “1.0”) that developers bump only for incompatible API changes, while each successful build produces a unique build version (often including a timestamp) that is automatically recorded.  Developers never manage build versions by hand – they always depend on the interface version, and the system’s version set resolves that to the exact build artifact.  This separation ensures backward compatibility: changes within an interface version are guaranteed non-breaking, and if a change is breaking, the interface version is incremented.
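
The interface-to-build resolution described above can be sketched in a few lines of Python.  The package names and version numbers here are hypothetical, chosen only to illustrate the mapping:

```python
import time

# A version set maps (package, interface version) -> exact build version.
# Package names and versions below are made up for illustration.
version_set = {
    ("auth-client", "1.0"): "1.0.1593214096",
    ("http-core", "2.1"): "2.1.1593201000",
}

def resolve(package: str, interface: str) -> str:
    """Developers declare only the interface version; the version set
    pins it to a concrete build artifact."""
    return version_set[(package, interface)]

def new_build_version(interface: str) -> str:
    """Each successful build records a fresh, timestamped build version."""
    return f"{interface}.{int(time.time())}"
```

A dependency on `auth-client 1.0` thus always resolves to whatever build the version set currently pins, never to a hand-chosen build number.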

Each package’s config lists its dependencies, including compile-time, runtime, and test libraries.  Dependencies are declared explicitly and scoped correctly: build-tool dependencies (like compilers or build scripts) are versioned but not included on runtime classpaths; compile-time dependencies are needed to compile or link; and runtime dependencies are what get deployed.  Test-only dependencies are kept separate.  The build system enforces a clean environment (only declared dependencies plus a minimal bootstrap toolchain are available) so that no hidden libraries sneak into the classpath.  This guarantees reproducible builds: any build run with the same version set and source code will produce identical artifacts.
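
A package configuration along these lines might look as follows, shown as a Python dict purely for illustration (the field names and versions are assumptions, not the system’s actual schema):

```python
# Illustrative package config; field names and versions are hypothetical.
package_config = {
    "name": "auth-client",
    "interface_version": "1.0",
    "dependencies": {
        "build": ["build-tools = 3.x"],       # toolchain only, never deployed
        "compile": ["http-core = 2.1"],       # needed to compile or link
        "runtime": ["http-core = 2.1", "json-codec = 1.4"],  # shipped with the service
        "test": ["mock-server = 3.0"],        # used only during test phases
    },
}

def deployment_dependencies(config: dict) -> list:
    """Only runtime dependencies are delivered when the package is deployed;
    build, compile-only, and test scopes are stripped out."""
    return config["dependencies"]["runtime"]
```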

Version Sets and Dependency Management

At the core is the concept of a version set, which is a named snapshot of consistent package versions.  A version set is essentially a locking mechanism: it pins every package in the dependency graph (the full dependency closure) to specific build versions.  There is typically a default global set (often called “live” or “main”) that contains widely-used libraries and shared code, and teams create their own version sets that branch from or import the global set.  When you build a package, you choose a version set to build against – this means all of that package’s dependencies (by interface version) are resolved to the specific build versions recorded in that version set.

Each build against a version set updates it via a new version-set event.  Concretely, a build request might specify package X and version set Y.  The build system takes package X’s code and, using the dependency versions from Y, compiles it and runs tests.  Only if all requested packages in that build (often one package, or multiple in a coordinated build) succeed are the new artifacts published and the version set updated with their new build versions.  This creates a new “event” in the version set (often denoted by a timestamp or sequence number) that points to the parent event.  Builds on the same version set are serialized by default to prevent race conditions, though independent branches of the dependency graph can run in parallel as long as they don’t overlap.
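
The event chain can be sketched as follows; the event IDs and package names are invented, and a real system would use transactional storage rather than an in-memory list:

```python
import itertools

_seq = itertools.count(1620000000)  # stand-in for a timestamp/sequence source

def commit_build(events: list, built: dict) -> dict:
    """Record a new version-set event after a (possibly coordinated) build
    succeeds.  The event points at its parent and carries the full pinned
    closure, with the newly built versions layered on top."""
    parent = events[-1] if events else None
    pinned = dict(parent["pinned"]) if parent else {}
    pinned.update(built)
    event = {
        "id": next(_seq),
        "parent": parent["id"] if parent else None,
        "pinned": pinned,
    }
    events.append(event)
    return event

events = []
commit_build(events, {"frontend": "1.0.1593214096"})
commit_build(events, {"backend": "1.0.1593214097"})
```

Because events are appended serially and each carries its parent’s ID, the history of the version set is a simple linear chain that can be replayed or rolled back.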

Version sets can be merged or imported to update dependencies.  For example, a team’s version set may periodically import the latest from the global “live” set: this pulls in new package versions and triggers rebuilds of anything that depends on them, producing a new version-set event that refreshes the closure.  Version sets also allow hierarchical layering: an organization might have a top-level set for core libraries, per-team sets for each service, and even short-lived sets for experimental branches.  Each version set is anchored by one or more root packages (e.g. the main services), and any dependencies that fall out of the closure are automatically pruned from the set.  This hierarchy lets teams upgrade their dependencies on their own schedule while still having a stable baseline and the ability to merge shared changes when needed.
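
The pruning step described above amounts to a reachability walk from the root packages; this minimal sketch (with hypothetical package names) drops anything outside the closure:

```python
def prune_to_closure(pinned: dict, deps: dict, roots: list) -> dict:
    """Keep only packages reachable from the root packages; anything that
    falls out of the dependency closure is dropped from the version set."""
    keep, stack = set(), list(roots)
    while stack:
        pkg = stack.pop()
        if pkg in keep:
            continue
        keep.add(pkg)
        stack.extend(deps.get(pkg, []))
    return {p: v for p, v in pinned.items() if p in keep}

# Hypothetical set: "old-lib" is no longer reachable from the root service.
pinned = {"service": "1.0.100", "http-core": "2.1.90", "old-lib": "0.9.50"}
deps = {"service": ["http-core"]}
pruned = prune_to_closure(pinned, deps, ["service"])
```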

Local Development Workspaces

Individual developers work in workspaces that tie into version sets to maintain consistency on the desktop.  A workspace is essentially a local namespace with an associated version set.  Developers can check out (edit) any number of package repositories in that workspace.  Checked-out packages override the versions in the version set: the build uses the local copy for that package, while all other dependencies are pulled from the version set.  In practice, a developer might have a “team” workspace containing all of their team’s projects, plus smaller workspaces for single projects or experimental features.  Because each workspace is namespaced to the user’s machine, multiple workspaces (and versions of dependencies) can coexist without conflict.  This ensures that running a local build or test has the same dependency closure as the CI build, while still allowing one to work on multiple packages simultaneously.  In other words, a workspace preserves multi-package consistency locally: all dependencies come from the chosen version set except those intentionally overridden by the developer.
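
The override rule is simple enough to state in code; the paths and package names here are hypothetical:

```python
def resolve_dependency(name: str, version_set: dict, checkouts: dict):
    """A checked-out package overrides the version set; everything else is
    pulled from the locked set, exactly as a CI build would see it."""
    if name in checkouts:
        return ("local-source", checkouts[name])
    return ("artifact", version_set[name])

# Hypothetical workspace: "frontend" is checked out, "auth-client" is not.
version_set = {"frontend": "1.0.1593214096", "auth-client": "2.0.1593100000"}
checkouts = {"frontend": "/workspaces/teamX/frontend"}
```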

Workspaces may also be used to test deployment overrides: for example, a developer could deploy an application to a local environment and selectively override one library or component to see how it behaves, without updating the entire version set.  This is a convenience feature for development but is noted as not strictly reproducible, since it departs from the locked version-set closure.

Build System and Reproducibility

The build system itself is data-driven and integrated with the version control system.  Each package’s configuration file (often a simple JSON/YAML or domain-specific format) lists its language/toolchain, interface version, and dependencies.  The build driver constructs a directed acyclic graph (DAG) of build steps according to the config.  It retrieves the exact package versions from the selected version set (or workspace overrides) and then invokes the language-native build tools (e.g. npm/yarn for JavaScript, make/cmake for C/C++, pip/setuptools for Python, etc.) to compile or package the code.
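
Ordering the DAG is a topological sort; the standard library’s `graphlib` (Python 3.9+) suffices for a sketch, with hypothetical package names standing in for real build targets:

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical edges: package -> the packages it depends on.
deps = {
    "frontend": {"auth-client"},
    "backend": {"auth-client", "db-driver"},
    "auth-client": set(),
    "db-driver": set(),
}

# TopologicalSorter yields dependencies before their dependents, so every
# build step runs only after the artifacts it needs already exist.
order = list(TopologicalSorter(deps).static_order())
```

Independent branches of the graph (here, `auth-client` and `db-driver`) have no ordering constraint between them, which is what allows the parallel builds mentioned earlier.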

Importantly, the build system ensures purity: it boots with only a minimal runtime (just the compiler/interpreter and basic OS) and fetches every declared dependency explicitly.  This prevents “works on my machine” problems.  Any accidental dependency on a library outside the declared list would break the controlled build, causing an error.  This strict isolation ensures that every build (local or in CI) is fully reproducible.  After a successful build, the system publishes the artifacts (binaries, JARs, static assets, etc.) to a central artifact repository or storage (often backed by object storage) and retains them indefinitely for auditing and rollback.  All artifacts include metadata linking them back to the exact VCS commits and version-set event that produced them.
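
The isolation check can be thought of as a simple set difference; this sketch assumes a hypothetical bootstrap toolchain and invented library names:

```python
def check_clean_environment(available, declared,
                            bootstrap=frozenset({"compiler", "os-base"})):
    """Fail the build if anything beyond the declared dependencies and the
    minimal bootstrap toolchain is visible in the build environment."""
    undeclared = set(available) - set(declared) - set(bootstrap)
    if undeclared:
        raise RuntimeError(f"undeclared dependencies present: {sorted(undeclared)}")
    return True
```

An accidental reliance on an undeclared library thus fails loudly at build time instead of surfacing later as a deployment mismatch.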

A centralized build service (sometimes called a “package builder”) automates this process.  Developers can manually kick off builds against a version set or configure the service to build on each commit to the main branch.  The build service works across supported platforms (Linux x86, ARM, etc.) and guarantees consistency: each build runs in a snapshot of the chosen version set, and once complete, the output artifact version is recorded.  By snapshotting the build environment and artifact, the system enforces end-to-end reproducibility.

Deployment Framework and Pipeline

Built artifacts are deployed by a separate deployment orchestrator through a stage-based pipeline (e.g. development → staging → production environments).  Deployment is always done per version-set event, ensuring the entire application (and all its services and libraries) comes from one consistent set of package versions.  The deployment system pulls the artifacts for that version set and prepares them for hosting.  Typically, it uses shared network storage and assembles “symlink trees” for each environment: for example, an application version might reside under /deploy/env/name/lib, /deploy/env/name/bin, etc., with symlinks arranged so processes see only that version’s libraries.  Each environment is isolated by name (similar to container namespaces), so multiple versions of the same service can coexist on a host (in different environments) without conflict.
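
A symlink tree of this kind can be assembled with plain filesystem calls; this minimal sketch uses a throwaway temp directory and invented artifact paths rather than the system’s real layout:

```python
import os
import tempfile

def link_environment(deploy_root: str, env_name: str, artifacts: dict) -> str:
    """Assemble a per-environment symlink tree so processes in that
    environment see exactly one version of each library."""
    lib_dir = os.path.join(deploy_root, env_name, "lib")
    os.makedirs(lib_dir, exist_ok=True)
    for name, artifact_path in artifacts.items():
        os.symlink(artifact_path, os.path.join(lib_dir, name))
    return lib_dir

# Usage against a throwaway directory standing in for shared storage:
root = tempfile.mkdtemp()
store = os.path.join(root, "store", "http-core-2.1.1593201000")
os.makedirs(store)
lib_dir = link_environment(root, "staging-fullstack", {"http-core": store})
```

Switching an environment to a different artifact version is then just repointing symlinks, which is what makes fast cutovers and rollbacks cheap.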

Zero-downtime updates are achieved by blue-green or rolling strategies.  For instance, when releasing a new version of a service, the orchestrator can deploy it into a parallel environment (blue) while the old version (green) is still running, switch traffic after health checks, and then retire the old environment.  Alternatively, it can incrementally roll updates to service instances one by one.  After deployment, the system runs any defined lifecycle scripts (startup or shutdown hooks), which are themselves kept in version control, so that the precise procedure for bringing the service up or down is versioned and repeatable.
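
The blue-green cutover reduces to a guarded pointer swap; the router here is a plain dict and the environment names are hypothetical:

```python
def switch_traffic(router: dict, service: str, new_env: str, healthy) -> str:
    """Blue-green cutover: point traffic at the new environment only after
    it passes health checks; the old environment stays up for rollback."""
    if not healthy(new_env):
        raise RuntimeError(f"{new_env} failed health checks; traffic unchanged")
    previous = router.get(service)
    router[service] = new_env
    return previous  # retained so a rollback can reactivate it

router = {"backend": "backend-green"}
previous = switch_traffic(router, "backend", "backend-blue", lambda env: True)
```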

Stage-based pipelines allow coordinated releases.  For example, a new full-stack release (frontend + backend) is first deployed to a testing environment where integration tests run.  Once validated, the same version-set deployment is promoted to staging and, finally, to production.  Because only complete version sets are deployed and all component versions in that set were built and tested together, cross-service consistency is guaranteed.  The deployment system also supports quick rollbacks: if a problem is detected, it can redeploy the previous version-set event, since all artifacts remain available.

Additional Mechanisms

Several internal mechanisms further enforce consistency:

  • Interface vs. Build Versioning: As noted, dependencies are declared by interface version, but the system internally records a build version for each package.  For example, a package with interface “1.0” might produce an artifact “1.0.1593214096” (where the trailing number is a timestamp or build counter).  The version set stores the build version, but developers only think in terms of the interface.  This implicit contract means all changes within “1.0” are backward-compatible, and incompatibilities require bumping the interface (to “1.1”).

  • Explicit Dependency Scopes: The system correctly separates build, compile, runtime, and test dependencies.  Build tools (e.g. the compiler or linker) are versioned packages themselves.  Runtime libraries are only deployed with the service, not needed during compilation, and test libraries are only used during test phases.  Transitive dependencies are fully resolved through the version set.  When preparing a deployment, test dependencies are stripped out, ensuring only the needed runtime libraries are delivered.

  • Immutable Artifacts and Traceability: Every published artifact is immutable and linked to the exact source commit.  Users viewing a deployed service can inspect metadata showing the source commit ID and version-set event used for that deployment.  This audit trail is critical for debugging and compliance.

  • Rebuild on Demand: Because the build process is scripted and version-driven, any historical release can be rebuilt.  Since every dependency was pinned in a version set, reproducing a past version-set event simply means checking out the old configs and running the build.  This is especially useful for releases that need security patches or backport fixes.
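
The immutable-artifact record from the list above can be modeled as a frozen value object; the field names and the sample commit ID are illustrative, not the system’s actual metadata schema:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Artifact:
    """Immutable record linking a published artifact back to its sources."""
    package: str
    build_version: str
    source_commit: str      # exact VCS commit that produced the artifact
    version_set_event: int  # event whose closure the build ran against

# Hypothetical published artifact:
a = Artifact("frontend", "1.0.1593214096", "9fceb02", 1620000000)
```

Freezing the record mirrors the guarantee that published artifacts are never mutated, only superseded by new builds.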

Example: Coordinated Full-Stack Application

Imagine a team developing a full-stack application composed of: (1) a JavaScript frontend (e.g. a React single-page app) and (2) a Python backend (e.g. a Flask service).  The frontend and backend are separate packages with their own dependencies (npm packages for the frontend, PyPI packages for the backend).

  1. Local Development: A developer creates a new workspace for the project, anchored to the team’s version set (say “teamX-frontend” and “teamX-backend”, or a combined “teamX-fullstack” set).  They check out both repositories into this workspace.  Locally, any package checked out (frontend and backend) will override the versions in the version set.  When the developer runs the build tool, it reads each package’s config and pulls in dependencies from the version set (for example, a UI library or common API client for the frontend, and a database driver for the backend).  The build tool then invokes npm install and npm run build for the frontend, and pip install plus any packaging step for the backend.  Because the workspace is tied to the version set, both builds use the exact same versions of shared libraries (e.g. if both depend on a common authentication client library, the version set ensures they get the same version).  The developer runs unit and integration tests for each part.  Workspaces make sure tests on the frontend see the correct mock services, and tests on the backend use the right configs, without any hidden mismatches.

  2. Commit and Build: The developer commits changes to both the frontend and backend repos.  The central build service detects the new commits and triggers a coordinated build.  The team’s version set (initially seeded from the global defaults) is selected.  The build service simultaneously rebuilds the frontend package and the backend package against this version set.  Because this is a coordinated release, the system waits for both builds (and their tests) to succeed.  Suppose the frontend had interface version 1.0 and the backend 1.0; the builds generate artifacts frontend-1.0.1593214096 and backend-1.0.1593214097, for example.  Only after both succeed are the artifacts published and the version set updated with these new build versions.

  3. Version Set Update: The version set now has a new event (e.g. “teamX-fullstack@1620000000”) recording that the frontend 1.0.1593214096 and backend 1.0.1593214097 are the latest.  All future builds against this set will use these exact versions unless updated.  Any transitive dependencies brought in by these builds (e.g. npm libraries or Python libs) are also now part of the dependency closure recorded in the event.  If, for instance, the frontend pulled in React 17.0, that specific build version of React is locked in the version set too.  Because the builds passed successfully with this closure, consistency is assured.

  4. Deployment: With the new version-set event ready, the deployment pipeline is invoked.  First, the orchestrator deploys the same version-set event to a staging environment.  It retrieves frontend-1.0.1593214096, backend-1.0.1593214097, and all other needed packages (e.g. a shared database connector library) from the artifact store.  The frontend static files are placed in the staging web environment (perhaps under /deploy/staging/fullstack-{timestamp}/), and the backend service is deployed to the staging servers (in its own environment directory).  The orchestrator builds the directory trees and symlinks so that each component only sees its intended library versions.  Startup scripts (part of each package’s deployment config) run to launch the services.  At this point, end-to-end staging tests run against the combined system.  Because both frontend and backend came from the same version set event, there are no mismatched versions of shared code or contract; they were built together for this release.

  5. Production Rollout: After validation, the orchestrator performs a zero-downtime rollout to production.  For example, it could deploy the new backend into a parallel (blue) production environment while the old (green) one is still handling live traffic.  Once healthy, traffic is switched over to the new backend.  Then the frontend is updated, again in a zero-downtime manner (e.g. swapping symlinks or rolling out to web servers behind a CDN).  Alternatively, if deploying containerized services, the orchestrator can do rolling updates across instances.  In every case, the key is that all new processes use the version-set-locked artifacts.  If any problems occur, the system can quickly roll back by reactivating the previous version-set event, since both versions’ artifacts remain stored.

Throughout this flow, reproducibility and traceability are maintained.  If at any point the team needs to reproduce the exact build environment (for a bug or hotfix), they simply rebuild package X against the recorded version-set event (perhaps choosing an older event) and they will get byte-for-byte identical results.  The deployment metadata will show exactly which version-set event and source commits were used, making audits and rollback decisions straightforward.

In summary, this system uses packages as modular units, version sets to lock dependency closure, and workspaces to give developers consistent local environments.  The build process is orchestrated for reproducibility (no hidden state, snapshot builds), and the deployment framework stages releases safely with zero-downtime strategies.  Together, these elements ensure that even as hundreds of services evolve independently, each integrated release is reproducible, testable, and deployable end-to-end.
