What Runs Where
Canonical reference for Inqura's deployment topology. CD consults this at spec time. CC updates it when topology changes. The cycle-close naming reality check (NQU-866 P-9) consults this each Sunday.
Quick Reference
| Question | Answer |
|---|---|
| Where does app.nquiry.ai serve from? | AWS account 760007728097, resources named invapp-dev-* (R-8 rename to invapp-prod-* deferred per NQU-878) |
| Is there a separate staging environment? | Yes — live since 2026-05-22 (NQU-865 Wave 2). JE-Vectors-test account 961381384763, resources invapp-staging-*, host staging.nquiry.ai. |
| Where does local dev run? | Localhost via Docker-compose stack (NQU-867). Pre-NQU-867, local dev pointed at prod resources — see memory [[prod_dev_conflation]]. |
| What model serves analysis? | Bedrock Sonnet 4.6 (primary) + Haiku 4.5 (quality) + Titan v2 (embeddings) + Cohere Rerank 3.5. All us-east-1. |
| Where do DB migrations live? | db/migrations/ — renamed from supabase/migrations/ on 2026-05-28 (NQU-826; this project does not use Supabase). Applied to each tier's RDS via npm run db:migrate (scripts/run-migration.ts, tracked by filename in _migrations); fresh installs bootstrap via bootstrap-db.sh. |
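
The filename-tracked migration model in the last row can be sketched as a set difference between what is on disk and what `_migrations` already records. This is a minimal illustration, not the contents of scripts/run-migration.ts: `pendingMigrations` is a hypothetical helper, the example filenames are invented, and it assumes filenames sort into apply order.

```typescript
// Hypothetical helper, NOT the actual scripts/run-migration.ts.
// onDisk:  filenames found in db/migrations/
// applied: filenames already recorded in the _migrations table
function pendingMigrations(onDisk: string[], applied: string[]): string[] {
  const done = new Set(applied);
  return onDisk
    .filter((name) => name.endsWith(".sql") && !done.has(name))
    .sort(); // assumption: filenames sort lexicographically into apply order
}

// Invented filenames, for illustration only.
const pending = pendingMigrations(
  ["002_add_index.sql", "001_init.sql", "003_add_flags.sql", "README.md"],
  ["001_init.sql"],
);
console.log(pending); // ["002_add_index.sql", "003_add_flags.sql"]
```

Because tracking is by filename, renaming an already-applied file would make it look pending again under this model.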
Account Map
Production — AWS Account 760007728097
- Provisioned: 2026-01-22 (initial Terraform scaffolding)
- Role: Serves app.nquiry.ai (live traffic)
- Identity: Single IAM user user/joe-dev with AdministratorAccess (R-3: to be split per environment in Phase 3)
- Naming: All resources prefixed invapp-dev-* (historical artifact; see memory [[prod_dev_conflation]]). R-8 rename to invapp-prod-* was deferred 2026-05-22 via NQU-878 (re-surface triggers: account-level architecture churn, naming-confusion drift across 3 consecutive cycles, or external/audit flagging).
- Region: us-east-1
- Last verified: 2026-05-22 by CC Wave 2 deployment on NQU-865
Staging — AWS Account 961381384763 (JE-Vectors-test)
- Provisioned: 2026-05-04 state backend (NQU-644 PR 4); app stack stood up 2026-05-22 (NQU-865 Wave 2 — 171 resources via terraform apply, 103 migrations, ECS 2/2 stable, deep-health 🟢 via staging.nquiry.ai)
- Role: Pre-prod environment for migrations, flag flips, smoke suite, and promotion-gate dry-runs (per NQU-855 cycle-end gate + NQU-881 promotion workflow)
- Identity: SSO via Identity Center user JE_Vectors (humans); bootstrap IAM user cc-staging-bootstrap (transitional, deletion deadline 2026-06-15 per NQU-871 Wave 4)
- Profile: je-vectors-test
- Naming: invapp-staging-* across all resources
- Host: staging.nquiry.ai (behind CloudFront Basic Auth gate)
- Region: us-east-1
- Last verified: 2026-05-22 by CC Wave 2 deployment on NQU-865
Localhost — Contributor workstations
- Role: CC's primary development environment
- Stack: Docker-compose (Postgres + LocalStack S3 + Cognito-Local) — implementation in progress via NQU-867
- Pre-NQU-867 state (historical, do not replicate): .env.local pointed DB_HOST / S3_* / COGNITO_* at production resources, and Stripe live keys sat in .env.local. Every npm run dev invocation read and wrote production data. See memory [[prod_dev_conflation]].
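
Because the prod prefix is invapp-dev-*, a name alone is easy to misread. A small lookup like the one below makes the mapping explicit. The account IDs, prefixes, and hosts are copied from the Account Map above; the `TIERS` table and `tierOf` helper are hypothetical names introduced only for this sketch.

```typescript
type Tier = "prod" | "staging";

interface TierInfo {
  accountId: string;
  prefix: string;
  host: string;
}

// Values copied from the Account Map above.
const TIERS: Record<Tier, TierInfo> = {
  prod: { accountId: "760007728097", prefix: "invapp-dev-", host: "app.nquiry.ai" },
  staging: { accountId: "961381384763", prefix: "invapp-staging-", host: "staging.nquiry.ai" },
};

// Hypothetical helper: which tier does a resource name actually belong to?
// Returns undefined for names outside both prefixes (e.g. legacy nquir-* names).
function tierOf(resourceName: string): Tier | undefined {
  const tiers = Object.keys(TIERS) as Tier[];
  return tiers.find((t) => resourceName.startsWith(TIERS[t].prefix));
}

console.log(tierOf("invapp-dev-alb")); // "prod", despite the "dev" in the name
console.log(tierOf("invapp-staging-redis")); // "staging"
```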
Per-Account Resource Inventory
Production (760007728097) — current state
| Resource Class | Names |
|---|---|
| ECS | invapp-dev-cluster / invapp-dev-app-service (1 task) |
| RDS | invapp-dev-postgres (db.t3.micro, single-AZ) |
| ALB | invapp-dev-alb |
| Redis | invapp-dev-redis-001 (cache.t3.micro) |
| ECR | invapp-dev-app |
| Cognito | invapp-dev-users (us-east-1_C8qcVKKp5) |
| WAF | invapp-dev-waf |
| CloudFront | E21SLQWJ38EH4B (fronts app.nquiry.ai) |
| S3 (app data) | investigation-app-dev-760007728097 |
| S3 (other) | invapp-dev-docs, invapp-dev-cloudtrail, retention manifest bucket, invapp-terraform-state |
| CloudWatch | ~20 alarms named invapp-dev-* (historical layered names like invapp-dev-prod-staging-health-failed have been cleaned up as of 2026-05-21) |
| Secrets Manager | invapp-dev/app-secrets, nquir/encryption-key, invapp-dev-basic-auth-credentials, nquir-linear-agents-secrets |
Legacy names still present: some resources carry the old nquir-* brand (nquir/encryption-key, nquir-audit-manager-*, nquir-config-*). R-9 covers harmonizing these during the R-8 rename pass.
Staging (961381384763) — live since 2026-05-22
| Resource Class | Names |
|---|---|
| ECS | invapp-staging-cluster / invapp-staging-app-service (1 task since NQU-1007, 2026-06-10 — was 2/2; scaled to match prod for promote-gate parity + ECS cost; container healthCheck on /api/health added same change) |
| RDS | invapp-staging-postgres (Multi-AZ, 103 migrations applied) |
| ALB | invapp-staging-alb-1289661186 |
| Redis | invapp-staging-redis |
| ECR | invapp-staging-app (IMMUTABLE tags — SHA-only, no :latest) |
| Cognito | invapp-staging-users (us-east-1_VMnVjCvJn / client 4oobpqkr9lb1kd7tc07oqf21ej) |
| WAF | invapp-staging-waf |
| CloudFront | Fronts staging.nquiry.ai (Basic Auth gate — CloudFront Function, hardcoded creds) |
| S3 (state) | invapp-terraform-state (Terraform backend) |
| Secrets Manager | invapp-staging/app-secrets (8 keys: DB, Redis, CRON, ANTHROPIC dummy, RESEND, Stripe test) |
| Stripe | Test-mode keys (sk_test / pk_test) |
Promoted via the staging-deploy CI job (AWS_ROLE_ARN_STAGING OIDC role) on every push to main, parallel with the prod deploy job until NQU-881 lands the promotion gate.
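
The SHA-only rule on the staging ECR repository (IMMUTABLE tags, no :latest) could be enforced before a deploy with a check along these lines. This is a sketch under one stated assumption: that "SHA-only" means the full 40-character lowercase git commit SHA. `isDeployableTag` and `imageUri` are hypothetical helpers, not part of the CI job.

```typescript
// Assumption: a deployable tag is a full 40-character lowercase git SHA.
const FULL_SHA = /^[0-9a-f]{40}$/;

function isDeployableTag(tag: string): boolean {
  return FULL_SHA.test(tag);
}

// Builds a private ECR image URI (us-east-1, per this doc), refusing
// anything that is not a SHA tag.
function imageUri(accountId: string, repo: string, tag: string): string {
  if (!isDeployableTag(tag)) {
    throw new Error(`refusing mutable or malformed tag: ${tag}`);
  }
  return `${accountId}.dkr.ecr.us-east-1.amazonaws.com/${repo}:${tag}`;
}

console.log(isDeployableTag("latest")); // false
```

Rejecting the tag client-side only gives a clearer error earlier; the repository's immutability setting remains the actual guard against overwriting a pushed tag.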
Network Topology
Production data path (current)
User → app.nquiry.ai (Cognito-gated)
↓
CloudFront E21SLQWJ38EH4B
↓
invapp-dev-alb (ALB, account 760007728097)
↓
invapp-dev-app-service (ECS Fargate, 1 task)
↓ (data fan-out)
├── invapp-dev-postgres (RDS db.t3.micro)
├── invapp-dev-redis-001 (ElastiCache)
├── investigation-app-dev-760007728097 (S3 — uploads, retention manifests, etc.)
├── invapp-dev-users Cognito User Pool (us-east-1_C8qcVKKp5)
└── Bedrock (account-scoped quotas, Sonnet 4.6 / Haiku 4.5 / Titan v2 / Cohere Rerank 3.5)
Cross-account access (live since 2026-05-22)
- CI deploys to both staging (AWS_ROLE_ARN_STAGING) and prod (AWS_ROLE_ARN) via OIDC AssumeRole. Both run in parallel on every main push until NQU-881 lands the promotion gate.
- Smoke suite (NQU-855) assumes least-privilege AWS_ROLE_ARN_STAGING_SMOKE in staging for Bedrock + Cognito access during the cycle-end gate.
- E2E tests assume AWS_ROLE_ARN_E2E in the prod account (760007728097), scoped to the subset of services e2e exercises (Cognito Admin*, S3 + KMS, Bedrock invoke + rerank, marketplace). NQU-865 W0-c part 1.
- Marketplace seller-API cross-account (NQU-839, 2026-05-31): AWS Marketplace ResolveCustomer + GetEntitlements enforce seller-account-scoped authorization, so the call must originate from the account that owns the SaaS listing (prod, 760007728097). Staging (961381384763) assumes arn:aws:iam::760007728097:role/invapp-marketplace-cross-account for these calls. Trust side provisioned in infrastructure/terraform/environments/dev/marketplace-cross-account.tf; identity side (sts:AssumeRole on the staging ECS task role) provisioned in infrastructure/terraform/modules/ecs/main.tf gated by enable_marketplace_assume_role. Trust policy requires sts:ExternalId (NQU-945, 2026-06-01): the staging app passes MARKETPLACE_EXTERNAL_ID (from invapp-staging/app-secrets in 961381384763) on every AssumeRole call; the seller-account trust policy enforces a Condition.StringEquals."sts:ExternalId" match against the same value (supplied via -var marketplace_external_id at apply time). The two sides must stay in sync; rotation requires updating both. Prod runs natively in the seller account and does not use AssumeRole. Customer-greenfield has no marketplace surface.
- Cost governance (NQU-946, 2026-06-01), per-account: both prod (760007728097) and staging (961381384763) instantiate modules/cost-governance with their own per-account budgets (total/ECS/RDS/NAT), weekly cost-report Lambda, monthly resource-cleanup Lambda, EventBridge schedule rules, ComputeOptimizer enrollment, and SES email identity for ops+sns@nquiry.ai. All four AWS Budgets per account carry a LinkedAccount cost_filter scoped to the local account ID so they don't bleed each other's spend (a pre-NQU-946 gap that triggered the chronic "+78% over budget" alerts on prod after the NQU-865 W2 staging split). AWS Config recorder + rules live only in prod; staging passes enable_config_recorder=false + enable_config_rules=false because no Config recorder exists in 961381384763 yet.
- VPC Interface Endpoints DISABLED pre-customer (NQU-946 step 3, 2026-06-02): the 5 interface endpoints we had (bedrock-runtime, logs, ssm, ssmmessages, secretsmanager) were destroyed in both accounts. May 2026 audit found USE1-VpcEndpoint-Hours=$72/account against USE1-VpcEndpoint-Bytes=$0 and NatGateway-Bytes=$0.19: we were paying for endpoint availability without using the bandwidth. Net save: ~$144/month combined at steady state. Bedrock invoke, ECS task logs (CloudWatch), Secrets Manager fetch, and SSM sessions now route via NAT Gateway → public AWS service endpoints (TLS-encrypted; security posture is "AWS-API-over-public-internet" rather than "AWS-API-over-PrivateLink"). REVERSAL TRIGGER: flip enable_interface_endpoints back to true (or convert to a per-env variable) in infrastructure/terraform/modules/nquiry-stack/main.tf:60 and apply BEFORE the first real customer onboards OR before any FedRAMP / HIPAA audit kicks off. The terraform module retains the resource definitions; reversal is one PR + apply.
- Bedrock quotas are independent per account: the core reason for the two-account topology. CC's experiments and smoke runs no longer share quota with live-user traffic.
- Customer-deployed Inqura (licensed track per NQU-408) runs in customer-owned accounts; image pulls via ECR Public (planned, NQU-645 / NQU-646).
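
The ExternalId handshake described in the marketplace cross-account item above has two halves that must carry the same value. The sketch below builds both as plain objects so the shape is visible. It is illustrative only: the real trust policy lives in Terraform, the trust principal shown (the staging ECS task role) is an assumption based on the Identity Map, and the session name is hypothetical.

```typescript
const PROD_ACCOUNT = "760007728097"; // seller account
const STAGING_ACCOUNT = "961381384763";

// Seller-account side: trust policy on invapp-marketplace-cross-account.
// Assumption: the trusted principal is the staging ECS task role.
function trustPolicy(externalId: string) {
  return {
    Version: "2012-10-17",
    Statement: [
      {
        Effect: "Allow",
        Principal: {
          AWS: `arn:aws:iam::${STAGING_ACCOUNT}:role/invapp-staging-ecs-task-role`,
        },
        Action: "sts:AssumeRole",
        Condition: { StringEquals: { "sts:ExternalId": externalId } },
      },
    ],
  };
}

// Staging side: parameters for the STS AssumeRole call.
function assumeRoleParams(externalId: string) {
  return {
    RoleArn: `arn:aws:iam::${PROD_ACCOUNT}:role/invapp-marketplace-cross-account`,
    RoleSessionName: "staging-marketplace", // hypothetical session name
    ExternalId: externalId, // MARKETPLACE_EXTERNAL_ID in the staging app
  };
}

// True only when both sides carry the same ExternalId.
function externalIdsMatch(
  policy: ReturnType<typeof trustPolicy>,
  params: ReturnType<typeof assumeRoleParams>,
): boolean {
  return policy.Statement.every(
    (s) => s.Condition.StringEquals["sts:ExternalId"] === params.ExternalId,
  );
}
```

A rotation that updates only one side leaves `externalIdsMatch` false, which corresponds to STS rejecting the AssumeRole call.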
Identity Map
Current (single-account)
| Identity | Type | Scope | Notes |
|---|---|---|---|
| user/joe-dev | IAM user | AdministratorAccess | Used for day-to-day CC ops AND live operations. R-3: split per env in Phase 3. |
| AWS_ROLE_ARN (CI) | OIDC role | CI deploy only | Partial OIDC adoption. |
| ECS task runtime (staging) | ECS task role | invapp-staging-ecs-task-role (961381384763) | Keyless since 2026-05-22 — Bedrock / S3 / marketplace run on the task role alone. No static keys injected. |
| ECS task runtime (prod) | ECS task role | invapp-dev-ecs-task-role (760007728097) | NQU-1006: static-key injections (IAM_ACCESS_KEY_ID / IAM_SECRET_ACCESS_KEY / BEDROCK_ACCESS_KEY_ID / BEDROCK_SECRET_ACCESS_KEY / deprecated ANTHROPIC_API_KEY) removed from the promote-to-prod task-def build; the task role already grants bedrock:InvokeModel* / Rerank / ApplyGuardrail / aws-marketplace:* / S3+KMS. Takes effect on the first promote after the filter lands. Key material retained in invapp-dev/app-secrets for trivial rollback. |
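
The NQU-1006 change in the last row amounts to filtering the static-key entries out of the container's secrets list while leaving the key material in Secrets Manager. A minimal sketch of such a filter follows. The five names come from the row above; `stripStaticKeys` and the `SecretRef` type are hypothetical, modeled on the name/valueFrom shape of ECS container-definition secrets.

```typescript
// Shape of one entry in an ECS container definition's `secrets` array.
interface SecretRef {
  name: string;
  valueFrom: string;
}

// Names listed in the NQU-1006 row above.
const STATIC_KEY_NAMES = new Set([
  "IAM_ACCESS_KEY_ID",
  "IAM_SECRET_ACCESS_KEY",
  "BEDROCK_ACCESS_KEY_ID",
  "BEDROCK_SECRET_ACCESS_KEY",
  "ANTHROPIC_API_KEY",
]);

// Hypothetical filter: drop static-key injections, keep everything else.
function stripStaticKeys(secrets: SecretRef[]): SecretRef[] {
  return secrets.filter((s) => !STATIC_KEY_NAMES.has(s.name));
}

// Placeholder valueFrom strings; real entries would be Secrets Manager ARNs.
const kept = stripStaticKeys([
  { name: "DATABASE_URL", valueFrom: "placeholder-arn-1" },
  { name: "BEDROCK_ACCESS_KEY_ID", valueFrom: "placeholder-arn-2" },
]);
console.log(kept.map((s) => s.name)); // ["DATABASE_URL"]
```

Since the filter only affects what is injected, rollback is a matter of skipping it; the secret values themselves are untouched.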
Target (post-Phase 3)
TBD pending Phase 2 ultraplan question #1 (per-environment vs. per-service-per-environment role granularity).
External Services
| Service | Production | Staging | Local dev (post-NQU-867) |
|---|---|---|---|
| Stripe | Live keys (sk_live / pk_live) in Secrets Manager only | Test keys (sk_test / pk_test) in invapp-staging/app-secrets (live since Wave 2) | Test keys in .env.local (NQU-867) |
| Bedrock | Account-scoped quota on 760007728097 — Sonnet 4.6 + Haiku 4.5 + Titan v2 + Cohere Rerank 3.5 | Account-scoped quota on 961381384763 — Sonnet 4.6 us. RPM raised (L-CCA5DF70 → 10001); Haiku 4.5 us. RPM raise pending; others available out-of-box | Localhost should target Docker-compose stack (NQU-867); historical pattern of hitting prod is the failure mode the two-account topology + NQU-867 eliminate |
| Anthropic-direct | Removed (NQU-783 / ADR-0002, 2026-05-18). Bedrock is the only inference path; resilience via multi-region fallback (us-east-1 → us-west-2) + throttle shedder in lib/ai/client.ts | Removed (same) | n/a (Bedrock only) |
| Sentry | All envs (cross-account-friendly) | TBD | TBD |
| Linear | All envs | n/a (dev-tooling, not env-scoped) | n/a |
| GitHub Actions | Secrets all live-account-scoped (R-6: split per environment in Phase 3) | TBD | n/a |
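
The multi-region fallback mentioned in the Anthropic-direct row (us-east-1 → us-west-2) might look roughly like the wrapper below. This is not the code in lib/ai/client.ts and it omits the throttle shedder entirely. `withRegionFallback` and `isThrottle` are hypothetical, `invoke` stands in for whatever performs the Bedrock call, and detecting throttling by the error name "ThrottlingException" is an assumption.

```typescript
// Assumption: throttling surfaces as an Error named "ThrottlingException".
function isThrottle(err: unknown): boolean {
  return err instanceof Error && err.name === "ThrottlingException";
}

// Region order taken from the table above.
const REGIONS = ["us-east-1", "us-west-2"] as const;
type Region = (typeof REGIONS)[number];

// Hypothetical wrapper: try each region in order, moving on only when
// the failure is a throttle. Any other error is rethrown immediately.
async function withRegionFallback<T>(
  invoke: (region: Region) => Promise<T>,
): Promise<T> {
  let lastError: unknown;
  for (const region of REGIONS) {
    try {
      return await invoke(region);
    } catch (err) {
      if (!isThrottle(err)) throw err;
      lastError = err;
    }
  }
  throw lastError; // every region throttled
}
```

Restricting fallback to throttling keeps genuine request errors (bad input, access denied) from being silently retried in a second region.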
Current State vs. Target State
Most rows below moved from "Target" to "Current" on 2026-05-22 (NQU-865 Wave 2). The promotion-gate row remains a target until NQU-881 lands the workflow_dispatch gate in Cycle 2 (due 2026-05-31).
| Item | Current (2026-05-22+) | Target / pending |
|---|---|---|
| AWS accounts in use | 2 (prod 760007728097 + staging 961381384763) ✅ | — |
| Prod resource naming | invapp-dev-* (R-8 rename deferred per NQU-878 re-surface triggers) | invapp-prod-* if a re-surface trigger fires |
| IAM identity for ops | Per-account: joe-dev (prod), SSO JE_Vectors (staging), bootstrap cc-staging-bootstrap (transitional) | Delete cc-staging-bootstrap by 2026-06-15 (NQU-871 Wave 4) |
| Bedrock quota isolation | Account-scoped, separate quotas ✅ | — |
| CI credential set | Per-environment OIDC roles (AWS_ROLE_ARN, AWS_ROLE_ARN_STAGING, AWS_ROLE_ARN_STAGING_SMOKE, AWS_ROLE_ARN_E2E) ✅ | — |
| Local dev resources | Docker-compose stack (NQU-867 in flight) | Same |
| Promotion path | main push deploys to BOTH staging and prod in parallel; cycle-end smoke (NQU-855) runs against staging at cycle close | NQU-881 (Cycle 2, due 2026-05-31) — remove deploy from push trigger; promote via workflow_dispatch |
| Pre-prod buffer | Staging environment IS the buffer ✅; prod still hardened by CloudFront WAF + Cognito | — |
How to Use This Doc
For CD spec-writing
When writing a spec that mentions environments, promotion paths, or AWS resources, consult this doc and cite the verified topology in the spec header. Do NOT use generic "production/staging/promote" terms without anchors. See memory [[cd_spec_topology_citations]].
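
One way to catch unanchored environment terms before a spec ships is a line-level scan like the following. It is a rough heuristic sketched for illustration, not an existing tool: `unanchoredLines` is hypothetical, and treating an account ID, host, or resource prefix on the same line as a sufficient anchor is an assumption.

```typescript
// Generic environment words that need an anchor.
const GENERIC_TERMS = /\b(production|prod|staging|promote)\b/i;

// Anchors copied from this doc: account IDs, hosts, resource prefixes.
const ANCHORS = [
  "760007728097",
  "961381384763",
  "app.nquiry.ai",
  "staging.nquiry.ai",
  "invapp-dev-",
  "invapp-staging-",
];

// Hypothetical check: return lines that use a generic term with no anchor.
function unanchoredLines(spec: string): string[] {
  return spec
    .split("\n")
    .filter(
      (line) => GENERIC_TERMS.test(line) && !ANCHORS.some((a) => line.includes(a)),
    );
}

const flagged = unanchoredLines(
  [
    "Deploy to staging first.",
    "Deploy to staging (961381384763) first.",
    "Add a retry button.",
  ].join("\n"),
);
console.log(flagged); // ["Deploy to staging first."]
```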
For CC topology-changing PRs
When making a PR that changes the topology — rename, account split, new resource class, IAM change, new external service, env-var change that affects which environment a runtime points at — update this doc in the SAME PR. The doc and the topology should never drift. Update the last-verified: frontmatter to the PR date.
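
The same-PR rule could be checked mechanically from a PR's changed-file list. The sketch below is one possible shape, with loud assumptions: the doc path is the superseded-by path quoted in the maintenance decisions, the only topology prefix listed is infrastructure/terraform/ (other change types named above, such as env-var changes, are not covered), and both paths are assumed to be relative to the same root. `missingDocUpdate` is a hypothetical name.

```typescript
// Path of this doc, as cited in the predecessor docs' superseded-by field.
const DOC_PATH = "investigation-app/docs/admin/ops/what-runs-where.md";

// Assumption: topology changes show up as edits under these prefixes.
const TOPOLOGY_PREFIXES = ["infrastructure/terraform/"];

// Hypothetical check: true when a PR touches topology but not this doc.
function missingDocUpdate(changedFiles: string[]): boolean {
  const touchesTopology = changedFiles.some((file) =>
    TOPOLOGY_PREFIXES.some((prefix) => file.startsWith(prefix)),
  );
  const touchesDoc = changedFiles.some((file) => file === DOC_PATH);
  return touchesTopology && !touchesDoc;
}

console.log(missingDocUpdate(["infrastructure/terraform/modules/ecs/main.tf"])); // true
```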
For the cycle-close naming reality check
The scheduled task cycle-close-naming-reality-check (NQU-866 P-9) consults this doc each Sunday to verify spec/comment language across the week matches verified topology.
Ratified Maintenance Decisions (Joe, 2026-05-21)
- Cross-agent access to this doc: (a) preferred, conditional on mount-add cost. Joe ratified option (a): nquiry-workspace gets surfaced to CC's Cowork mount so CC has direct read/write access to this doc. Topology-changing PRs from CC then include doc updates inline. Fallback to (c) (CC posts updates as Linear comments, CD applies) if CC reports that adding the workspace to its mount is non-trivial. CD pending CC's answer to the mount-cost question.
- Last-verified cadence: ratified. Re-verify on every topology-changing PR (mandatory) AND at each cycle close via the P-9 scheduled task (cycle-close-naming-reality-check). Re-verify means: spot-check 2–3 entries against the live account and update the last-verified: frontmatter.
- Predecessor doc status: ratified historical. Both predecessors marked historical on 2026-05-21:
  - investigation-app/docs/reference/process/environment-strategy.md: frontmatter status: historical, superseded-by: investigation-app/docs/admin/ops/what-runs-where.md, in-body banner.
  - investigation-app/docs/decisions/2026-04-03-staging-to-production-transition.md: Status: Historical (superseded 2026-05-21 by NQU-865/866/868), in-body banner noting this file is the canonical example of the doc-vs-decision drift pattern from the NQU-866 RCA.
Architecture-of-record going forward is this file. Edits to the predecessors should not happen; edit here instead.
References
- NQU-865 — Stand up real prod isolation (the work this doc describes the target of)
- NQU-866 — RCA prod/dev conflation (the incident that motivated this doc)
- NQU-867 — Local dev isolation — rotate Stripe + Docker-compose
- NQU-855 — Cycle-end integration gate (smoke suite consumes the promotion path described here)
- NQU-868 — This doc's creation ticket
- NQU-869 — CC audit-routine default (full-account scope)
- NQU-728 — Bedrock Sonnet 4.6 migration (incident enabled by the conflation)
- Memory: [[prod_dev_conflation]], [[cd_spec_topology_citations]], [[deferrals_require_resurface]], [[over_deferral_meta_pattern]], [[joe_instincts_are_data]], [[je_vectors_test_account]], [[cd_to_cc_handoff_via_linear]]
- Predecessor docs (now superseded by this doc, pending ratification of Open Q-3 above):
  - investigation-app/docs/ops/environment-strategy.md: first to document "dev is actually prod" (2026-03-27)
  - investigation-app/docs/decisions/2026-04-03-staging-to-production-transition.md: Gate-1 ADR (Status: Proposed; never executed Phase 3 / Decision 7 Resource Rename)
v1 by CD, 2026-05-21, synthesizing CC's NQU-865 Phase 1 inventory + the locked Phase 1 ADR + NQU-867 scope. Pending Joe's review and ratification of the three Open Maintenance Questions above.