@ralphbean
Created April 2, 2026 14:38
Revisions

  1. ralphbean created this gist Apr 2, 2026.
    192 changes: 192 additions & 0 deletions 0010-per-org-github-apps.md
    @@ -0,0 +1,192 @@
    ---
    title: "10. Per-org GitHub Apps for agent identity"
    status: Proposed
    relates_to:
    - agent-architecture
    - agent-infrastructure
    - security-threat-model
    topics:
    - security
    - credentials
    - identity
    - compute-agnosticism
    ---

    # 10. Per-org GitHub Apps for agent identity

    Date: 2026-04-02

    ## Status

    Proposed

    ## Context

    The admin CLI (branch `agent-admin-cli-clean-room-v4`) creates per-org GitHub
    Apps using the manifest flow — one app per agent role (fullsend, triage, coder,
    review) — and stores their private keys as repo secrets in the org's `.fullsend`
    repo. This is an implicit design decision that has not been explicitly justified
    or recorded.

    The standard GitHub App model is designed for multi-tenancy *without* per-org
    apps: a single app owner holds the private key, many orgs install the app, and
    the owner's backend mints scoped installation tokens per org. This is how
    hosted GitHub Apps such as Dependabot, Codecov, and Renovate work.

    The question is whether fullsend should follow that standard model (global
    shared apps) or require each adopting org to create its own apps.

    The answer depends on where agent workloads execute. Fullsend's design requires
    adopting orgs to run agent workloads on their own infrastructure. Today that
    means GitHub Actions; in the future it may mean Kubernetes clusters or other
    compute platforms the org controls. The credential mechanism must work across
    all of these.

    This ADR relates to [ADR 0009](0009-github-app-ephemeral-credentials.md), which
    addresses how ephemeral tokens are generated from app credentials. This ADR
    addresses the prior question: who owns the apps and where do the private keys
    live.

    ## Options

    ### Option 1: Per-org GitHub Apps (current implementation)

    Each adopting org creates its own set of GitHub Apps (one per agent role) during
    installation. The org holds the private keys and stores them in the `.fullsend`
    repo's secrets. Workflow runs in the org use these keys to mint short-lived
    installation tokens.

    **Pros:**
    - Compute-agnostic. The private key is a portable blob that can be stored in
    any secret store (GitHub Actions secrets, Kubernetes Secrets, Vault). Token
    minting is ~20 lines of code that runs identically on any platform.
    - Zero external dependencies. No central service, no availability concerns, no
    single point of failure.
    - True sovereignty. The org owns the apps, the keys, and the permissions. No
    trust relationship with a third party required.
    - Blast radius isolation. A compromise of one org's keys affects only that org.
    - Aligns with "the repo is the coordinator" — no external coordination layer.
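
The token minting step mentioned above can be sketched roughly as follows. This is a minimal illustration, not the fullsend implementation: the function names are hypothetical, and it assumes the third-party PyJWT package (with its `cryptography` dependency) is installed for RS256 signing. The endpoint and JWT claims follow GitHub's documented App authentication flow.

```python
import json
import time
import urllib.request

GITHUB_API = "https://api.github.com"


def build_app_jwt_claims(app_id: str, now: int) -> dict:
    """Claims for a GitHub App JWT (hypothetical helper name)."""
    return {
        "iat": now - 60,      # backdated one minute to tolerate clock skew
        "exp": now + 9 * 60,  # GitHub rejects expiry more than 10 minutes out
        "iss": app_id,        # the app's ID identifies the signer
    }


def mint_installation_token(app_id: str, installation_id: int, private_key_pem: str) -> str:
    """Sign an app JWT and exchange it for a short-lived installation token."""
    import jwt  # PyJWT; imported lazily and assumed to be installed

    app_jwt = jwt.encode(
        build_app_jwt_claims(app_id, int(time.time())),
        private_key_pem,
        algorithm="RS256",
    )
    request = urllib.request.Request(
        f"{GITHUB_API}/app/installations/{installation_id}/access_tokens",
        method="POST",
        headers={
            "Authorization": f"Bearer {app_jwt}",
            "Accept": "application/vnd.github+json",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["token"]
```

Nothing here depends on where the code runs; the only inputs are the app ID, the installation ID, and the PEM-encoded key.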

    **Cons:**
    - Onboarding friction. The manifest flow requires browser interaction per app,
    multiplied by the number of agent roles (currently 4).
    - Permissions drift. When a new permission or event subscription is needed for
    an agent role, every org must update their apps individually. There is no
    centralized upgrade path.
    - Operational burden. Org admins manage multiple GitHub Apps (creation,
    installation, key rotation, deletion).
    - App name squatting. GitHub App slugs are globally unique. A naming collision
    (`fullsend-acme-coder`) blocks the legitimate org from using that slug.
    - Lost key recovery. A GitHub App private key can be downloaded only when it
    is generated. If it is lost, an app owner must generate a replacement key in
    the app's settings and update every secret store that held the old one.
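
To make the slug-collision concern concrete, here is a small sketch of resolving an app slug with a per-role override. The function name and the shape of the `overrides` mapping are assumptions for illustration; the admin CLI's actual config format is not shown in this document.

```python
from typing import Dict, Optional


def app_slug(org: str, role: str, overrides: Optional[Dict[str, str]] = None) -> str:
    """Return the GitHub App slug for an org's agent role.

    `overrides` (hypothetical) maps a role to an explicit slug, for use when
    the conventional name is already taken in GitHub's global namespace.
    """
    if overrides and role in overrides:
        return overrides[role]
    # Conventional name: fullsend-{org}-{role}
    return f"fullsend-{org}-{role}".lower()
```

For example, an org whose conventional `coder` slug is taken could set `{"coder": "acme-fs-coder"}` and leave the other roles on the default convention.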

    ### Option 2: Global shared apps with reusable workflow token vending

    Fullsend owns one set of global GitHub Apps. The private keys are stored as
    secrets in the `fullsend-ai` org. A reusable workflow in the `fullsend-ai` org
    mints installation tokens for calling orgs via `workflow_call`.

    **Pros:**
    - Simple onboarding. Orgs install the app with one click; no manifest flow.
    - Centralized permission updates. New permissions are added once to the global
    apps.
    - No private key management for adopting orgs.

    **Cons:**
    - Coupled to GitHub Actions. `workflow_call` is a GitHub Actions primitive;
    Kubernetes pods and other compute platforms cannot call reusable workflows.
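    - Token hand-off is fragile. Reusable workflow outputs are plain strings: an
    unmasked token can appear in logs, and GitHub Actions may withhold an output
    it detects as containing a masked secret.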
    - Single point of compromise. If the fullsend org's secrets leak, all adopting
    orgs are affected.
    - Centralized dependency. The fullsend org's workflows must be available for
    any adopting org to operate.

    ### Option 3: Global shared apps with OIDC token vending service

    Fullsend owns global apps and runs a stateless token vending service (e.g.,
    Cloudflare Worker, AWS Lambda). Callers present an OIDC token proving their
    identity; the service verifies it and returns a scoped installation token.
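
As a rough sketch of the authorization step such a service would perform, the function below maps already-verified OIDC claims to an installation ID. It assumes signature verification against GitHub's published keys has happened elsewhere. The function name, the audience value, and the `installations` mapping (a stand-in for looking up the org's installation through the GitHub API) are hypothetical.

```python
GITHUB_ACTIONS_ISSUER = "https://token.actions.githubusercontent.com"


def resolve_installation(claims: dict, audience: str, installations: dict) -> int:
    """Map verified GitHub Actions OIDC claims to the caller org's installation ID.

    Raises PermissionError if the token was not issued by GitHub Actions, was
    minted for a different audience, or names an org with no installation.
    """
    if claims.get("iss") != GITHUB_ACTIONS_ISSUER:
        raise PermissionError("unexpected issuer")
    if claims.get("aud") != audience:
        raise PermissionError("unexpected audience")
    org = claims.get("repository_owner")
    if org not in installations:
        raise PermissionError(f"no installation for org {org!r}")
    return installations[org]
```

The issuer check is what ties this sketch to GitHub Actions: a token from any other compute platform would need its own issuer, trust root, and claim mapping.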

    **Pros:**
    - Simple onboarding. One-click app installation.
    - Stateless service. No database, trivially simple (~50 lines of code).
    - Self-hostable. Orgs can run their own instance.

    **Cons:**
    - Requires a running service — even a minimal one is infrastructure to operate,
    monitor, and secure.
    - Identity federation complexity. GitHub Actions OIDC tokens only prove GitHub
    Actions identity. Kubernetes uses different OIDC tokens with different trust
    roots. Each new compute platform requires a new identity attestation
    integration, creating an ever-growing compatibility matrix.
    - Network reachability. The service must be reachable from wherever compute
    runs, which may not be possible from air-gapped or firewalled clusters.
    - Single point of compromise if a shared instance is used.

    ### Option 4: Hybrid — global apps by default, per-org override

    Offer global shared apps as the default path, with an option for orgs that want
    full sovereignty to create their own apps.

    **Pros:**
    - Low friction for most adopters.
    - Sovereignty for those who need it.

    **Cons:**
    - Two code paths to maintain and test.
    - The global path still has the compute-coupling and centralization problems of
    Options 2 or 3.
    - Complexity of supporting both models may exceed the benefit.

    ## Decision

    Each adopting organization creates its own set of GitHub Apps — one per agent
    role — during fullsend installation (Option 1).

    The decisive factor is **compute-agnosticism**. Fullsend requires adopting orgs
    to run agent workloads on their own infrastructure. Today that is GitHub
    Actions; in the future it is expected to include Kubernetes clusters and
    potentially other platforms. The credential mechanism must work identically
    regardless of where the compute runs.

    Per-org apps with org-held private keys satisfy this requirement cleanly:
    - The private key is a portable secret that can be stored in any platform's
    secret management system.
    - Token minting (sign JWT, exchange for installation token) is a simple,
    platform-independent operation.
    - No network reachability to a central service is required.
    - No platform-specific identity attestation (OIDC provider integration) is
    needed.
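
One way to picture the portability claim: the same loader can read the key whether the platform injects it as an environment variable (as GitHub Actions secrets typically are) or mounts it as a file (as a Kubernetes Secret volume would). This is a sketch under assumptions; the variable names `FULLSEND_APP_PRIVATE_KEY` and `FULLSEND_APP_PRIVATE_KEY_FILE` are invented for illustration.

```python
import os
from pathlib import Path


def load_private_key(
    env_var: str = "FULLSEND_APP_PRIVATE_KEY",
    file_var: str = "FULLSEND_APP_PRIVATE_KEY_FILE",
    environ=os.environ,
) -> str:
    """Return the app's PEM private key from the environment or a mounted file."""
    inline = environ.get(env_var)
    if inline:
        # Key injected directly, e.g. from a CI secret.
        return inline
    path = environ.get(file_var)
    if path:
        # Key mounted as a file, e.g. from a secret volume.
        return Path(path).read_text()
    raise RuntimeError(f"set {env_var} or {file_var}")
```

The rest of the token-minting path does not need to know which branch supplied the key.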

    Each alternative's global-app path requires either a central service (Option 3)
    or a GitHub Actions-specific mechanism (Option 2); Option 4's default path
    inherits one or the other. These approaches would require new integrations
    for each additional compute platform, creating an ever-growing compatibility
    matrix. Per-org apps avoid this entirely.

    The "no managed service" framing understates the actual constraint. The real
    architectural requirement is that the credential mechanism must be
    **compute-agnostic by design**, a consequence of fullsend's
    bring-your-own-infrastructure model.

    ## Consequences

    - **Adopting orgs own their credential lifecycle.** They create apps, store
    keys, rotate credentials, and delete apps on their own terms. No dependency
    on fullsend infrastructure.
    - **Onboarding requires manifest flow interaction.** The admin CLI must guide
    org admins through creating and installing multiple GitHub Apps. This is
    more friction than a one-click marketplace install.
    - **No centralized permission upgrade path.** When agent roles need new
    permissions or event subscriptions, each org must update their apps. The
    admin CLI should provide tooling to detect and remediate permission drift
    (e.g., an `analyze` subcommand that compares installed app permissions
    against expected permissions).
    - **App slug collisions are possible.** The naming convention
    (`fullsend-{org}-{role}`) uses a global namespace. Mitigation: the CLI
    already supports slug overrides and known-slug mappings in config.
    - **Future compute platforms work automatically.** Moving from GitHub Actions
    to Kubernetes (or any other platform) requires only storing the existing
    private key in the new platform's secret store. No changes to the app
    model, no new service integrations, no identity federation.
    - **Blast radius is isolated per org.** A key compromise in one org cannot
    affect any other org.
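
The drift check suggested in the consequences above could be as simple as the comparison below. This is a sketch, not the CLI's `analyze` implementation: the function name and the read/write/admin ordering table are assumptions, and the `installed` mapping stands in for the `permissions` object GitHub reports for an app installation.

```python
# Ordering of GitHub permission levels, lowest to highest.
LEVELS = {"read": 1, "write": 2, "admin": 3}


def permission_drift(expected: dict, installed: dict) -> dict:
    """Return permissions whose installed level is below the expected level.

    Maps each drifted permission name to an (installed, expected) pair;
    the installed value is None when the permission is missing entirely.
    """
    drift = {}
    for name, wanted in expected.items():
        current = installed.get(name)
        if LEVELS.get(current, 0) < LEVELS[wanted]:
            drift[name] = (current, wanted)
    return drift
```

An empty result means the installed app already grants everything the role expects; extra or higher permissions are not flagged by this sketch.
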
    250 changes: 250 additions & 0 deletions conversation-per-org-github-apps.md
    @@ -0,0 +1,250 @@
    # Conversation: Challenging the per-org GitHub Apps assumption

    Date: 2026-04-02

    Participants: rbean, Claude Opus 4.6

    ## Context

    This conversation took place on the `agent-admin-cli-clean-room-v4` branch of
    the fullsend repo, which implements a CLI for org administrators to install
    fullsend. The CLI creates per-org GitHub Apps using the manifest flow — one app
    per agent role (fullsend, triage, coder, review) — and stores their private
    keys as repo secrets in the org's `.fullsend` repo.

    ---

    ## rbean

    I need you to do some research. The code on this branch rests on an implicit
    design decision (assumption!) about how github apps work. I want you to take a
    critical eye to that decision and challenge it. The assumption is that when an
    org administrator "installs" fullsend, they need to create their own github
    apps for use. Here's my chain of logic. We don't want to run a managed service.
    We want the org admins to run agent workloads on their infrastructure. Since our
    only infra layer right now is github, that means github actions -- their github
    actions somewhere in their org. Those workflow runs need to act "as" the apps to
    comment on issues as them, submit PRs as them, submit code review as them. In
    order for that to work, the org administrator needs to have the private keys of
    the apps available as repo secrets in the .fullsend/ repo of their org (or
    somewhere in the org, anyways). And, (here's the implied challenge to the
    assumption..) if we were to have a single set of global github apps (one for
    each agent) that were shared globally by all orgs that use fullsend, then all of
    those orgs would need *copies* of the private keys associated with the apps, so
    that their workflow runs could mint short-term tokens for those apps, so that
    their workflow runs could act "as" those apps on repos in their org. Since we
    don't want to share those sensitive secrets broadly with anyone who wants to
    install fullsend - and such secret sharing is inadmissible - we're forced to
    have the installation process carried out by an org administrator create a set
    of apps scoped to their org that they have full administrative control over;
    they own and manage the private keys for those apps.

    ## Claude

    ### Challenging the "per-org apps" assumption

    Your reasoning chain has a **specific weak link** in step 3. Here it is,
    isolated:

    > "if we were to have a single set of global github apps... then all of those
    > orgs would need *copies* of the private keys"

    **This is not how GitHub Apps are designed to work.** The GitHub App
    installation model was built specifically for multi-tenancy *without* sharing
    private keys. Here's the standard flow:

    1. A developer (fullsend) creates **one** GitHub App and holds the private key
    2. Many orgs **install** it (granting permissions on their repos)
    3. The app owner's backend uses the private key to mint a JWT, then exchanges
    it for an **installation access token** scoped to a single org
    4. That token can only access repos in the org where the app is installed, with
    only the permissions granted during installation

    This is how every GitHub Marketplace app works — Dependabot, Codecov, Renovate,
    etc. None of them ask you to create your own copy of their app.

    ### Where your reasoning actually holds up

    Your logic becomes valid when you add the constraint **"we don't want to run a
    managed service."** That's the real load-bearing assumption — not the GitHub App
    architecture, but the infrastructure constraint.

    If workflows run in the org's GitHub Actions, and there's no central service to
    call, then the workflow itself needs to mint the installation token. To do that,
    it needs the private key. And sharing private keys is inadmissible. So per-org
    apps.

    The chain is really: **no managed service -> compute runs in org's Actions ->
    org needs private keys -> can't share private keys -> per-org apps.**

    ### Three concrete alternatives that challenge the assumption

    **1. Reusable workflows as a token vending proxy**

    This is the most interesting one because it requires **zero infrastructure
    outside GitHub**.

    A reusable workflow in the `fullsend-ai` org could:
    - Hold the global app private keys as org secrets
    - Be called by workflows in the target org via `workflow_call`
    - Mint a short-lived installation token for the caller's org
    - The calling workflow uses the token for the rest of its run

    The calling org's workflow looks like:
    ```yaml
    jobs:
      get-token:
        uses: fullsend-ai/.github/.github/workflows/mint-token.yml@main
        with:
          role: coder
      do-work:
        needs: get-token
        # use the token...
    ```

    **Problem:** Reusable workflow outputs are strings visible in logs. You'd need
    to work around this — perhaps by having the reusable workflow *do* the
    authenticated work directly (not just mint a token), or by using an
    intermediate mechanism. Also, the fullsend org becomes a single point of
    compromise: if its secrets leak, all orgs are affected.

    **2. OIDC + stateless token vending**

    GitHub Actions can mint OIDC tokens that cryptographically prove the identity
    of the workflow (org, repo, branch, workflow name). A stateless function
    (Cloudflare Worker, AWS Lambda) could:
    - Verify the OIDC token
    - Check that the calling org has installed the fullsend app
    - Mint and return a scoped installation token

    This is the pattern GitHub recommends for cloud auth (AWS, GCP, Azure all use
    it). The service is:
    - Stateless (no database)
    - Trivially simple (~50 lines of code)
    - Self-hostable (orgs paranoid about centralization can run their own)

    You could argue this "isn't really a managed service" in the way that running a
    database-backed SaaS is. But it's still *something* running somewhere, which
    violates the constraint as stated.

    **3. Hybrid: global apps with opt-in per-org override**

    Offer global shared apps as the default (easy onboarding,
    `actions/create-github-app-token` with a centralized token vending mechanism),
    but allow orgs to bring their own apps if they want full sovereignty. The
    install CLI already has most of the machinery for this.

    ### What per-org apps actually cost you

    The current approach isn't free. The costs are real and compound:

    1. **Permissions drift.** When you add a new event subscription or permission
    to an agent role (say, `coder` now needs `statuses: write`), every org has
    to update their app. With global apps, you update once. The code in
    `types.go:36-94` defines permissions per role — but those are only applied
    at creation time. There's no upgrade path.

    2. **Onboarding friction.** The manifest flow (`appsetup.go:268-385`) starts a
    local HTTP server, opens a browser, POSTs a form, waits for a callback, then
    requires a *second* browser interaction to install the newly created app.
    Multiply by 4 roles. Compare to: "click Install on the fullsend app page,
    select your org, done."

    3. **Operational burden.** The org admin now manages 4 GitHub Apps (create,
    install, rotate keys if compromised, delete on uninstall). The
    `handleExistingApp` method at `appsetup.go:216-251` already handles the "PEM
    is lost" failure mode — which wouldn't exist with global apps because the org
    never has the PEM.

    4. **App name squatting.** Since apps have global slug namespaces,
    `fullsend-acme-coder` is claimed by whoever creates it first. If org `acme`
    doesn't use fullsend but someone registers that slug, the real `acme` is
    stuck.

    ### What per-org apps buy you

    To be fair, the benefits are also real:

    1. **Zero external dependencies.** No service to run, no availability concerns,
    no SPOF. Pure GitHub-native.
    2. **True sovereignty.** The org owns the apps, the keys, the permissions. No
    trust in a third party.
    3. **Blast radius isolation.** A compromise of one org's keys affects only that
    org.
    4. **Aligns with "the repo is the coordinator."** No external coordination
    layer.

    ### Assessment

    The assumption is **defensible but not forced**. Your reasoning chain presents
    it as the only option given the constraints, but the "no managed service"
    constraint is doing all the work. The GitHub App model itself doesn't require
    per-org apps — it was specifically designed for the opposite.

    The question worth asking is: **Is "no managed service" a binary constraint, or
    a spectrum?** A stateless OIDC token vendor is orders of magnitude simpler than
    "a managed service" in the traditional sense. And the reusable workflow approach
    requires zero infrastructure at all — it just shifts where the secrets live
    (fullsend org instead of each customer org).

    If the answer is truly "zero infrastructure, not even a Lambda, not even a
    reusable workflow holding shared secrets," then per-org apps is the correct
    conclusion. But that constraint should be stated explicitly in the design docs,
    because it's the actual decision — the per-org apps are a consequence, not a
    decision in themselves.

    ---

    ## rbean

    Does your assessment change if you consider that we almost certainly have a
    future requirement to be able to run our workloads not on github actions, but
    instead on other infrastructure - like, say, a kubernetes cluster where the orgs
    that have installed fullsend bring their own cluster?

    ## Claude

    Yes, significantly. It reinforces the per-org apps decision and undermines all
    three alternatives I raised.

    **Every alternative I proposed is tightly coupled to GitHub Actions as the
    compute platform:**

    1. **Reusable workflows**: `workflow_call` is a GitHub Actions primitive. A
    Kubernetes pod can't call one. Dead on arrival.

    2. **OIDC token vending** — GitHub Actions OIDC tokens prove "I am a workflow
    running in org X, repo Y." A Kubernetes pod doesn't have a GitHub Actions
    OIDC token. You could use Kubernetes service account OIDC tokens instead, but
    now your vending service needs to understand multiple identity attestation
    mechanisms per platform, and you need to establish trust roots for each one.
    It's no longer "a trivial stateless function" — it's a real identity
    federation service.

    3. **Any centralized token minting** — needs to be network-reachable from
    wherever the compute runs. GitHub Actions has network access by default. A
    customer's air-gapped or firewalled Kubernetes cluster? Not necessarily.

    **Per-org apps with org-held private keys are compute-agnostic.** The private
    key is just a blob. The org stores it wherever their compute runs:

    - GitHub Actions -> repo/org secret
    - Kubernetes -> K8s Secret or Vault
    - Something else -> whatever secret store that platform uses

    The code to mint an installation token (`sign JWT with private key -> POST
    /app/installations/{id}/access_tokens`) is ~20 lines in any language and works
    identically regardless of where it executes. No platform-specific identity
    federation, no network reachability requirements to a central service, no
    coupling to GitHub's compute model.

    **The constraint isn't really "no managed service." It's "compute-agnostic by
    design."** And that's a much stronger justification for per-org apps than the
    one currently implicit in the code. The bring-your-own-compute requirement makes
    the alternatives not just inconvenient but architecturally unsound — each new
    compute platform would require a new identity attestation integration with the
    central service, turning it into an ever-growing compatibility matrix.

    Your assumption is correct, and for a better reason than the one you stated. I'd
    recommend making this explicit in the design docs — the per-org app model is a
    consequence of compute-agnosticism, not just secret-sharing aversion.