What Runs Where

Canonical reference for Inqura's deployment topology. CD consults this at spec time. CC updates it when topology changes. The cycle-close naming reality check (NQU-866 P-9) consults this each Sunday.

Quick Reference

| Question | Answer |
| --- | --- |
| Where does app.nquiry.ai serve from? | AWS account 760007728097, resources named invapp-dev-* (R-8 rename to invapp-prod-* deferred per NQU-878) |
| Is there a separate staging environment? | Yes, live since 2026-05-22 (NQU-865 Wave 2). JE-Vectors-test account 961381384763, resources invapp-staging-*, host staging.nquiry.ai. |
| Where does local dev run? | Localhost via Docker-compose stack (NQU-867). Pre-NQU-867, local dev pointed at prod resources; see memory [[prod_dev_conflation]]. |
| What model serves analysis? | Bedrock Sonnet 4.6 (primary) + Haiku 4.5 (quality) + Titan v2 (embeddings) + Cohere Rerank 3.5. All us-east-1. |
| Where do DB migrations live? | db/migrations/, renamed from supabase/migrations/ on 2026-05-28 (NQU-826; this project does not use Supabase). Applied to each tier's RDS via npm run db:migrate (scripts/run-migration.ts, tracked by filename in _migrations); fresh installs bootstrap via bootstrap-db.sh. |
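
The filename-based tracking mentioned in the last row can be sketched roughly as follows. This is not the contents of scripts/run-migration.ts: the function name pendingMigrations, the .sql extension, the example filenames, and the assumption that lexicographic filename order equals apply order are all illustrative.

```typescript
// Hypothetical sketch: work out which migration files still need to run,
// given the filenames on disk and the filenames already recorded in the
// _migrations tracking table.
// Assumptions (not confirmed by this doc): migrations are .sql files and
// are applied in lexicographic filename order.
function pendingMigrations(onDisk: string[], applied: string[]): string[] {
  const done = new Set(applied);
  return onDisk
    .filter((name) => name.endsWith(".sql") && !done.has(name))
    .sort();
}

// Example with made-up filenames: only 001 has been recorded as applied.
const pending = pendingMigrations(
  ["002_add_index.sql", "001_init.sql", "003_add_flags.sql", "README.md"],
  ["001_init.sql"],
);
// pending is ["002_add_index.sql", "003_add_flags.sql"]
```

Because tracking is by filename, renaming an already-applied file would make it look pending again under this model, which is one reason to treat applied migration filenames as immutable.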

Account Map

Production — AWS Account 760007728097

  • Provisioned: 2026-01-22 (initial Terraform scaffolding)
  • Role: Serves app.nquiry.ai (live traffic)
  • Identity: Single IAM user user/joe-dev with AdministratorAccess (R-3: to be split per environment in Phase 3)
  • Naming: All resources prefixed invapp-dev-* — historical artifact, see memory [[prod_dev_conflation]]. R-8 rename to invapp-prod-* was deferred 2026-05-22 via NQU-878 (re-surface triggers: account-level architecture churn, naming-confusion drift across 3 consecutive cycles, or external/audit flagging).
  • Region: us-east-1
  • Last verified: 2026-05-22 by CC Wave 2 deployment on NQU-865

Staging — AWS Account 961381384763 (JE-Vectors-test)

  • Provisioned: 2026-05-04 state backend (NQU-644 PR 4); app stack stood up 2026-05-22 (NQU-865 Wave 2 — 171 resources via terraform apply, 103 migrations, ECS 2/2 stable, deep-health 🟢 via staging.nquiry.ai)
  • Role: Pre-prod environment for migrations, flag flips, smoke suite, and promotion-gate dry-runs (per NQU-855 cycle-end gate + NQU-881 promotion workflow)
  • Identity: SSO via Identity Center user JE_Vectors (humans); bootstrap IAM user cc-staging-bootstrap (transitional, deletion deadline 2026-06-15 per NQU-871 Wave 4)
  • Profile: je-vectors-test
  • Naming: invapp-staging-* across all resources
  • Host: staging.nquiry.ai (behind CloudFront Basic Auth gate)
  • Region: us-east-1
  • Last verified: 2026-05-22 by CC Wave 2 deployment on NQU-865

Localhost — Contributor workstations

  • Role: CC's primary development environment
  • Stack: Docker-compose (Postgres + LocalStack S3 + Cognito-Local) — implementation in progress via NQU-867
  • Pre-NQU-867 state (historical, do not replicate): .env.local pointed DB_HOST / S3_* / COGNITO_* at production resources. Stripe live keys in .env.local. Every npm run dev invocation read/wrote production data. See memory [[prod_dev_conflation]].

Per-Account Resource Inventory

Production (760007728097) — current state

| Resource Class | Names |
| --- | --- |
| ECS | invapp-dev-cluster / invapp-dev-app-service (1 task) |
| RDS | invapp-dev-postgres (db.t3.micro, single-AZ) |
| ALB | invapp-dev-alb |
| Redis | invapp-dev-redis-001 (cache.t3.micro) |
| ECR | invapp-dev-app |
| Cognito | invapp-dev-users (us-east-1_C8qcVKKp5) |
| WAF | invapp-dev-waf |
| CloudFront | E21SLQWJ38EH4B (fronts app.nquiry.ai) |
| S3 (app data) | investigation-app-dev-760007728097 |
| S3 (other) | invapp-dev-docs, invapp-dev-cloudtrail, retention manifest bucket, invapp-terraform-state |
| CloudWatch | ~20 alarms named invapp-dev-* (historical layered names like invapp-dev-prod-staging-health-failed have been cleaned up as of 2026-05-21) |
| Secrets Manager | invapp-dev/app-secrets, nquir/encryption-key, invapp-dev-basic-auth-credentials, nquir-linear-agents-secrets |

Legacy names still present: some resources carry the old nquir-* brand (nquir/encryption-key, nquir-audit-manager-*, nquir-config-*). R-9 covers harmonizing these during the R-8 rename pass.

Staging (961381384763) — live since 2026-05-22

| Resource Class | Names |
| --- | --- |
| ECS | invapp-staging-cluster / invapp-staging-app-service (1 task since NQU-1007, 2026-06-10; was 2/2. Scaled down to match prod for promote-gate parity and ECS cost; container healthCheck on /api/health added in the same change) |
| RDS | invapp-staging-postgres (Multi-AZ, 103 migrations applied) |
| ALB | invapp-staging-alb-1289661186 |
| Redis | invapp-staging-redis |
| ECR | invapp-staging-app (IMMUTABLE tags: SHA-only, no :latest) |
| Cognito | invapp-staging-users (us-east-1_VMnVjCvJn / client 4oobpqkr9lb1kd7tc07oqf21ej) |
| WAF | invapp-staging-waf |
| CloudFront | Fronts staging.nquiry.ai (Basic Auth gate: CloudFront Function, hardcoded creds) |
| S3 (state) | invapp-terraform-state (Terraform backend) |
| Secrets Manager | invapp-staging/app-secrets (8 keys: DB, Redis, CRON, ANTHROPIC dummy, RESEND, Stripe test) |
| Stripe | Test-mode keys (sk_test / pk_test) |

Promoted via the staging-deploy CI job (AWS_ROLE_ARN_STAGING OIDC role) on every push to main, parallel with the prod deploy job until NQU-881 lands the promotion gate.

Network Topology

Production data path (current)

User → app.nquiry.ai (Cognito-gated)
↓
CloudFront E21SLQWJ38EH4B
↓
invapp-dev-alb (ALB, account 760007728097)
↓
invapp-dev-app-service (ECS Fargate, 1 task)
↓ (data fan-out)
├── invapp-dev-postgres (RDS db.t3.micro)
├── invapp-dev-redis-001 (ElastiCache)
├── investigation-app-dev-760007728097 (S3: uploads, retention manifests, etc.)
├── invapp-dev-users Cognito User Pool (us-east-1_C8qcVKKp5)
└── Bedrock (account-scoped quotas, Sonnet 4.6 / Haiku 4.5 / Titan v2 / Cohere Rerank 3.5)

Cross-account access (live since 2026-05-22)

  • CI deploys to both staging (AWS_ROLE_ARN_STAGING) and prod (AWS_ROLE_ARN) via OIDC AssumeRole. Both run in parallel on every main push until NQU-881 lands the promotion gate.
  • Smoke suite (NQU-855) assumes least-privilege AWS_ROLE_ARN_STAGING_SMOKE in staging for Bedrock + Cognito access during the cycle-end gate.
  • E2E tests assume AWS_ROLE_ARN_E2E in the prod account (760007728097) — scoped to the subset of services e2e exercises (Cognito Admin*, S3 + KMS, Bedrock invoke + rerank, marketplace). NQU-865 W0-c part 1.
  • Marketplace seller-API cross-account (NQU-839, 2026-05-31): AWS Marketplace ResolveCustomer + GetEntitlements enforce seller-account-scoped authorization, so the call must originate from the account that owns the SaaS listing (prod, 760007728097). Staging (961381384763) assumes arn:aws:iam::760007728097:role/invapp-marketplace-cross-account for these calls. Trust side provisioned in infrastructure/terraform/environments/dev/marketplace-cross-account.tf; identity side (sts:AssumeRole on the staging ECS task role) provisioned in infrastructure/terraform/modules/ecs/main.tf gated by enable_marketplace_assume_role. Trust policy requires sts:ExternalId (NQU-945, 2026-06-01): the staging app passes MARKETPLACE_EXTERNAL_ID (from invapp-staging/app-secrets in 961381384763) on every AssumeRole call; the seller-account trust policy enforces a Condition.StringEquals."sts:ExternalId" match against the same value (supplied via -var marketplace_external_id at apply time). The two sides must stay in sync; rotation requires updating both. Prod runs natively in the seller account and does not use AssumeRole. Customer-greenfield has no marketplace surface.
  • Cost governance (NQU-946, 2026-06-01) — per-account: both prod (760007728097) and staging (961381384763) instantiate modules/cost-governance with their own per-account budgets (total/ECS/RDS/NAT), weekly cost-report Lambda, monthly resource-cleanup Lambda, EventBridge schedule rules, ComputeOptimizer enrollment, and SES email identity for ops+sns@nquiry.ai. All four AWS Budgets per account carry a LinkedAccount cost_filter scoped to the local account ID so they don't bleed each other's spend (a pre-NQU-946 gap that triggered the chronic "+78% over budget" alerts on prod after the NQU-865 W2 staging split). AWS Config recorder + rules live only in prod; staging passes enable_config_recorder=false + enable_config_rules=false because no Config recorder exists in 961381384763 yet.
  • VPC Interface Endpoints DISABLED pre-customer (NQU-946 step 3, 2026-06-02): the 5 interface endpoints we had (bedrock-runtime, logs, ssm, ssmmessages, secretsmanager) were destroyed in both accounts. May 2026 audit found USE1-VpcEndpoint-Hours=$72/account against USE1-VpcEndpoint-Bytes=$0 and NatGateway-Bytes=$0.19 — we were paying for endpoint availability without using the bandwidth. Net save: ~$144/month combined at steady state. Bedrock invoke, ECS task logs (CloudWatch), Secrets Manager fetch, and SSM sessions now route via NAT Gateway → public AWS service endpoints (TLS-encrypted; security posture is "AWS-API-over-public-internet" rather than "AWS-API-over-PrivateLink"). REVERSAL TRIGGER: flip enable_interface_endpoints back to true (or convert to a per-env variable) in infrastructure/terraform/modules/nquiry-stack/main.tf:60 and apply BEFORE the first real customer onboards OR before any FedRAMP / HIPAA audit kicks off. The terraform module retains the resource definitions; reversal is one PR + apply.
  • Bedrock quotas are independent per account — the core reason for the two-account topology. CC's experiments and smoke runs no longer share quota with live-user traffic.
  • Customer-deployed Inqura (licensed track per NQU-408) runs in customer-owned accounts; image pulls via ECR Public (planned, NQU-645 / NQU-646).
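
The ExternalId handshake described in the marketplace bullet above can be modelled as two small builders plus a match check. This is a simplified sketch, not the app's or Terraform's actual code: the helper names and the session name are hypothetical, the trust statement omits its Principal element, and real evaluation happens inside AWS STS/IAM rather than in application code.

```typescript
// Hypothetical sketch of the two sides of the ExternalId handshake.
// Field names mirror the STS AssumeRole request (RoleArn, RoleSessionName,
// ExternalId) and the IAM policy grammar; the helpers are illustrative.
interface AssumeRoleInput {
  RoleArn: string;
  RoleSessionName: string;
  ExternalId?: string;
}

// Principal is deliberately omitted; this doc does not state it.
interface TrustStatement {
  Effect: "Allow";
  Action: "sts:AssumeRole";
  Condition: { StringEquals: { "sts:ExternalId": string } };
}

const MARKETPLACE_ROLE_ARN =
  "arn:aws:iam::760007728097:role/invapp-marketplace-cross-account";

// Identity side (staging): what the app would send on every AssumeRole call.
function buildAssumeRoleInput(externalId: string): AssumeRoleInput {
  return {
    RoleArn: MARKETPLACE_ROLE_ARN,
    RoleSessionName: "staging-marketplace", // hypothetical session name
    ExternalId: externalId,
  };
}

// Trust side (prod): the condition the seller-account role enforces.
function buildTrustStatement(externalId: string): TrustStatement {
  return {
    Effect: "Allow",
    Action: "sts:AssumeRole",
    Condition: { StringEquals: { "sts:ExternalId": externalId } },
  };
}

// Simplified model of the condition: an exact, case-sensitive string match.
function trustAllows(stmt: TrustStatement, req: AssumeRoleInput): boolean {
  return req.ExternalId === stmt.Condition.StringEquals["sts:ExternalId"];
}
```

Under this model a rotation that updates only one side makes trustAllows return false, which is the failure the "must stay in sync" note warns about.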

Identity Map

Current (per-account)

| Identity | Type | Scope | Notes |
| --- | --- | --- | --- |
| user/joe-dev | IAM user | AdministratorAccess | Used for day-to-day CC ops AND live operations. R-3: split per env in Phase 3. |
| AWS_ROLE_ARN (CI) | OIDC role | CI deploy only | Partial OIDC adoption. |
| ECS task runtime (staging) | ECS task role | invapp-staging-ecs-task-role (961381384763) | Keyless since 2026-05-22: Bedrock / S3 / marketplace run on the task role alone. No static keys injected. |
| ECS task runtime (prod) | ECS task role | invapp-dev-ecs-task-role (760007728097) | NQU-1006: static-key injections (IAM_ACCESS_KEY_ID / IAM_SECRET_ACCESS_KEY / BEDROCK_ACCESS_KEY_ID / BEDROCK_SECRET_ACCESS_KEY / deprecated ANTHROPIC_API_KEY) removed from the promote-to-prod task-def build; the task role already grants bedrock:InvokeModel* / Rerank / ApplyGuardrail / aws-marketplace:* / S3+KMS. Takes effect on the first promote after the filter lands. Key material retained in invapp-dev/app-secrets for trivial rollback. |

Target (post-Phase 3)

TBD pending Phase 2 ultraplan question #1 (per-environment vs. per-service-per-environment role granularity).

External Services

| Service | Production | Staging | Local dev (post-NQU-867) |
| --- | --- | --- | --- |
| Stripe | Live keys (sk_live / pk_live) in Secrets Manager only | Test keys (sk_test / pk_test) in invapp-staging/app-secrets (live since Wave 2) | Test keys in .env.local (NQU-867) |
| Bedrock | Account-scoped quota on 760007728097: Sonnet 4.6 + Haiku 4.5 + Titan v2 + Cohere Rerank 3.5 | Account-scoped quota on 961381384763: Sonnet 4.6 us. RPM raised (L-CCA5DF70 → 10001); Haiku 4.5 us. RPM raise pending; others available out-of-box | Localhost should target Docker-compose stack (NQU-867); historical pattern of hitting prod is the failure mode the two-account topology + NQU-867 eliminate |
| Anthropic-direct | Removed (NQU-783 / ADR-0002, 2026-05-18). Bedrock is the only inference path; resilience via multi-region fallback (us-east-1 → us-west-2) + throttle shedder in lib/ai/client.ts | Removed (same) | n/a (Bedrock only) |
| Sentry | All envs (cross-account-friendly) | TBD | TBD |
| Linear | All envs | n/a (dev-tooling, not env-scoped) | n/a |
| GitHub Actions | Secrets all live-account-scoped (R-6: split per environment in Phase 3) | TBD | n/a |
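
The multi-region fallback named in the Anthropic-direct row (us-east-1 → us-west-2) might look roughly like the sketch below. It is not the code in lib/ai/client.ts: invokeWithFallback, isThrottle, and the injected invoke callback are hypothetical, the throttle shedder is not modelled, and treating only ThrottlingException as retryable is an assumption.

```typescript
// Hypothetical sketch of ordered region fallback.
const REGION_ORDER = ["us-east-1", "us-west-2"] as const;
type Region = (typeof REGION_ORDER)[number];

// The caller supplies the real invocation (for example a Bedrock call
// bound to a client for the given region).
type Invoke<T> = (region: Region) => Promise<T>;

// Assumption: throttling surfaces as an Error named "ThrottlingException".
const isThrottle = (err: unknown): boolean =>
  err instanceof Error && err.name === "ThrottlingException";

async function invokeWithFallback<T>(
  invoke: Invoke<T>,
  isRetryable: (err: unknown) => boolean = isThrottle,
): Promise<{ region: Region; result: T }> {
  let lastError: unknown;
  for (const region of REGION_ORDER) {
    try {
      return { region, result: await invoke(region) };
    } catch (err) {
      // Non-retryable errors surface immediately instead of masking
      // a real failure behind a second region.
      if (!isRetryable(err)) throw err;
      lastError = err;
    }
  }
  // Every region was throttled: rethrow the last error seen.
  throw lastError;
}
```

Passing the invocation in as a callback keeps the ordering logic testable without AWS credentials, which suits the localhost stack where no real Bedrock endpoint should be reachable.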

Current State vs. Target State

Most rows below moved from "Target" to "Current" on 2026-05-22 (NQU-865 Wave 2). The promotion-gate row remains a target until NQU-881 lands the workflow_dispatch gate in Cycle 2 (due 2026-05-31).

| Item | Current (2026-05-22+) | Target / pending |
| --- | --- | --- |
| AWS accounts in use | 2 (prod 760007728097 + staging 961381384763) ✅ | |
| Prod resource naming | invapp-dev-* (R-8 rename deferred per NQU-878 re-surface triggers) | invapp-prod-* if a re-surface trigger fires |
| IAM identity for ops | Per-account: joe-dev (prod), SSO JE_Vectors (staging), bootstrap cc-staging-bootstrap (transitional) | Delete cc-staging-bootstrap by 2026-06-15 (NQU-871 Wave 4) |
| Bedrock quota isolation | Account-scoped, separate quotas ✅ | |
| CI credential set | Per-environment OIDC roles (AWS_ROLE_ARN, AWS_ROLE_ARN_STAGING, AWS_ROLE_ARN_STAGING_SMOKE, AWS_ROLE_ARN_E2E) ✅ | |
| Local dev resources | Docker-compose stack (NQU-867 in flight) | Same |
| Promotion path | main push deploys to BOTH staging and prod in parallel; cycle-end smoke (NQU-855) runs against staging at cycle close | NQU-881 (Cycle 2, due 2026-05-31): remove deploy from push trigger; promote via workflow_dispatch |
| Pre-prod buffer | Staging environment IS the buffer ✅; prod still hardened by CloudFront WAF + Cognito | |

How to Use This Doc

For CD spec-writing

When writing a spec that mentions environments, promotion paths, or AWS resources, consult this doc and cite the verified topology in the spec header. Do NOT use generic "production/staging/promote" terms without anchors. See memory [[cd_spec_topology_citations]].

For CC topology-changing PRs

When making a PR that changes the topology — rename, account split, new resource class, IAM change, new external service, env-var change that affects which environment a runtime points at — update this doc in the SAME PR. The doc and the topology should never drift. Update the last-verified: frontmatter to the PR date.

For the cycle-close naming reality check

The scheduled task cycle-close-naming-reality-check (NQU-866 P-9) consults this doc each Sunday to verify spec/comment language across the week matches verified topology.
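
One way to picture the check is as a prefix comparison between the environment a spec attributes a resource to and the naming recorded in the Account Map. The sketch below is illustrative only: the Claim shape and mismatchedClaims are hypothetical, and it does not describe how the scheduled task is actually implemented. Resources outside the invapp-* scheme (for example the nquir-* legacy names) would need an allow-list that is not modelled here.

```typescript
// Hypothetical sketch: flag resource names whose prefix does not match
// the environment they are attributed to. Prefixes come from the
// Account Map in this doc; everything else is illustrative.
type Env = "prod" | "staging";

const EXPECTED_PREFIX: Record<Env, string> = {
  prod: "invapp-dev-", // historical name; R-8 rename deferred per NQU-878
  staging: "invapp-staging-",
};

interface Claim {
  env: Env;
  resource: string;
}

function mismatchedClaims(claims: Claim[]): Claim[] {
  return claims.filter((c) => !c.resource.startsWith(EXPECTED_PREFIX[c.env]));
}

// Example: a spec that calls the prod ALB "invapp-prod-alb" gets flagged,
// because prod resources still carry the invapp-dev- prefix.
const flagged = mismatchedClaims([
  { env: "prod", resource: "invapp-prod-alb" },
  { env: "prod", resource: "invapp-dev-alb" },
  { env: "staging", resource: "invapp-staging-redis" },
]);
// flagged contains only the invapp-prod-alb claim
```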

Ratified Maintenance Decisions (Joe, 2026-05-21)

  1. Cross-agent access to this doc — (a) preferred, conditional on mount-add cost. Joe ratified option (a): nquiry-workspace gets surfaced to CC's Cowork mount so CC has direct read/write access to this doc. Topology-changing PRs from CC then include doc updates inline. Fallback to (c) (CC posts updates as Linear comments, CD applies) if CC reports that adding the workspace to its mount is non-trivial. CD pending CC's answer to the mount-cost question.

  2. Last-verified cadence — ratified. Re-verify on every topology-changing PR (mandatory) AND at each cycle close via the P-9 scheduled task (cycle-close-naming-reality-check). Re-verify means: spot-check 2–3 entries against the live account and update the last-verified: frontmatter.

  3. Predecessor doc status — ratified historical. Both predecessors marked historical on 2026-05-21:

    • investigation-app/docs/reference/process/environment-strategy.md — frontmatter status: historical, superseded-by: investigation-app/docs/admin/ops/what-runs-where.md, in-body banner.
    • investigation-app/docs/decisions/2026-04-03-staging-to-production-transition.md: marked Status: Historical (superseded 2026-05-21 by NQU-865/866/868), with an in-body banner noting that this file is the canonical example of the doc-vs-decision drift pattern from the NQU-866 RCA.

    Architecture-of-record going forward is this file. Edits to the predecessors should not happen; edit here instead.

References

  • NQU-865 — Stand up real prod isolation (the work this doc describes the target of)
  • NQU-866 — RCA prod/dev conflation (the incident that motivated this doc)
  • NQU-867 — Local dev isolation — rotate Stripe + Docker-compose
  • NQU-855 — Cycle-end integration gate (smoke suite consumes the promotion path described here)
  • NQU-868 — This doc's creation ticket
  • NQU-869 — CC audit-routine default (full-account scope)
  • NQU-728 — Bedrock Sonnet 4.6 migration (incident enabled by the conflation)
  • Memory: [[prod_dev_conflation]], [[cd_spec_topology_citations]], [[deferrals_require_resurface]], [[over_deferral_meta_pattern]], [[joe_instincts_are_data]], [[je_vectors_test_account]], [[cd_to_cc_handoff_via_linear]]
  • Predecessor docs (superseded by this doc and marked historical on 2026-05-21; see item 3 under Ratified Maintenance Decisions):
    • investigation-app/docs/ops/environment-strategy.md — first to document "dev is actually prod" (2026-03-27)
    • investigation-app/docs/decisions/2026-04-03-staging-to-production-transition.md — Gate-1 ADR (Status: Proposed; never executed Phase 3 / Decision 7 Resource Rename)

v1 by CD, 2026-05-21, synthesizing CC's NQU-865 Phase 1 inventory + the locked Phase 1 ADR + NQU-867 scope. The three maintenance questions were ratified by Joe on 2026-05-21; see Ratified Maintenance Decisions above.