Architecture Decision Records

Only decisions that materially constrain the architecture belong here. Domain rules and terminology live directly in Domain.

ADR-001: Brand-as-tenant multi-tenancy

Status: accepted · Date: 2026-06-11

Decision: Use one shared application and database. Every brand-owned record carries brand_id; repository/query APIs enforce tenant scope. GTIN ownership is an explicit Product Catalog mapping. A Brand User belongs to exactly one Brand.

Why: It provides testable brand isolation and cheap onboarding without multiplying infrastructure.

Trade-off: A tenant-scoping defect can leak data, so isolation requires central tests.
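
A minimal sketch of what "repository/query APIs enforce tenant scope" could look like, using an in-memory stand-in for the database. The names `BrandOwned`, `BrandContext` and `TenantScopedRepository` are hypothetical and not taken from the codebase; the point is only that no read or write path exists without a brand context.

```typescript
// Hypothetical record shape: every brand-owned row carries a brand id.
interface BrandOwned {
  id: string;
  brandId: string;
}

// Hypothetical caller context: a Brand User belongs to exactly one Brand.
interface BrandContext {
  readonly brandId: string;
}

// In-memory stand-in for a repository. Every method takes a BrandContext,
// so a caller cannot issue an unscoped query by accident.
class TenantScopedRepository<T extends BrandOwned> {
  private readonly rows: T[] = [];

  // Rejects rows that claim a different brand than the caller's.
  insert(ctx: BrandContext, row: T): void {
    if (row.brandId !== ctx.brandId) {
      throw new Error("row belongs to a different brand");
    }
    this.rows.push(row);
  }

  // Returns only rows owned by the caller's brand.
  list(ctx: BrandContext): T[] {
    return this.rows.filter((r) => r.brandId === ctx.brandId);
  }

  // An id owned by another brand behaves exactly like a missing id.
  findById(ctx: BrandContext, id: string): T | undefined {
    return this.rows.find((r) => r.id === id && r.brandId === ctx.brandId);
  }
}
```

Central isolation tests can then target this one class: insert rows for two brands and assert that neither `list` nor `findById` ever returns the other brand's rows.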

ADR-002: Modular monolith first

Status: superseded by ADR-007 · Date: 2026-06-11

Decision: Deploy the MVP as a modular monolith. Product Passport, Personalization & Verdict, Fridge, Identity & Consent, and Brand Analytics are code and ownership boundaries, not independently deployed services.

Why: The budget, team size and iteration speed do not justify microservices.

Trade-off: The course assignment may evolve modules into services later; boundaries must remain explicit enough to permit that extraction.

ADR-003: Resolve from a local catalog, assemble progressively

Status: accepted · Date: 2026-06-11

Decision: PackyTrace implements the GS1 Digital Link Resolver. A scan resolves a per-GTIN ProductCatalogEntry, while lot, serial and expiry remain per-item ScannedItem data. ResolvedPassport progressively combines the catalog entry with independent external sections. Open Food Facts and Agribalyse are the MVP sources behind ACL adapters and circuit breakers.

Why: Local catalog data keeps identity and Verdict computation reliable while external failures degrade individual sections instead of the whole scan.

Trade-off: Catalog data must be cached and kept fresh.
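
The degradation behaviour can be sketched as follows. This is a simplified illustration, not the service's implementation: the section sources are synchronous functions here (real adapters would be asynchronous), the breaker is count-based with no half-open state, and all type and field names are hypothetical.

```typescript
// Hypothetical shapes; field names are illustrative only.
interface ProductCatalogEntry {
  gtin: string;
  name: string;
}

type Section<T> =
  | { status: "ok"; data: T }
  | { status: "unavailable"; reason: string };

interface ResolvedPassport {
  entry: ProductCatalogEntry;
  sections: Record<string, Section<unknown>>;
}

// Minimal count-based circuit breaker: opens after `threshold` consecutive
// failures and then short-circuits every call until reset() is invoked.
class CircuitBreaker {
  private failures = 0;
  private readonly threshold: number;

  constructor(threshold: number) {
    this.threshold = threshold;
  }

  get open(): boolean {
    return this.failures >= this.threshold;
  }

  call<T>(fn: () => T): Section<T> {
    if (this.open) return { status: "unavailable", reason: "circuit-open" };
    try {
      const data = fn();
      this.failures = 0;
      return { status: "ok", data };
    } catch (_err) {
      this.failures += 1;
      return { status: "unavailable", reason: "source-error" };
    }
  }

  reset(): void {
    this.failures = 0;
  }
}

interface SectionSource {
  name: string;
  breaker: CircuitBreaker;
  fetch: (gtin: string) => unknown;
}

// The catalog entry is always present; each external section succeeds or
// degrades on its own, so one failing source never fails the whole scan.
function assemblePassport(
  entry: ProductCatalogEntry,
  sources: SectionSource[],
): ResolvedPassport {
  const sections: Record<string, Section<unknown>> = {};
  for (const s of sources) {
    sections[s.name] = s.breaker.call(() => s.fetch(entry.gtin));
  }
  return { entry, sections };
}
```

A production breaker would normally also re-probe the source after a cool-down period instead of waiting for an explicit `reset()`.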

ADR-004: Server-owned consumer state with pseudonymous visitors

Status: accepted · Date: 2026-06-11

Decision: Create a stable pseudonymous Visitor ID at first scan and link it when an Account is created. Health Profiles, consent records and Fridges are server-owned. Consent revocation, profile deletion and account deletion remain distinct operations; account deletion nullifies Account references on retained anonymous ScanRecords.

Why: This supports measurement, re-screening, expiry alerts and verifiable erasure without exposing health data in scan requests.

Trade-off: Visitor IDs miscount people: clearing storage or switching devices splits one person across several IDs, while a shared device merges several people into one.

ADR-005: Aggregate before the privacy wall

Status: accepted · Date: 2026-06-11

Decision: Raw scan, Verdict, Fridge-save and account-link facts remain consumer-side. Only minimum-group-size BrandMetricBatchPublished aggregates cross into Brand Analytics.

Why: Brands need engagement metrics, but no per-scan or per-visitor trail may cross the privacy wall.

Trade-off: Small groups and real-time individual events cannot appear in dashboards.
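
As an illustration of the minimum-group-size rule, the sketch below groups raw scan facts and publishes only groups that meet the threshold. It assumes the group key is brand, GTIN and day, and that group size is measured in distinct visitors; both are assumptions for the example, as are the `ScanFact` and `BrandMetric` shapes.

```typescript
// Hypothetical raw fact that stays on the consumer side of the privacy wall.
interface ScanFact {
  brandId: string;
  gtin: string;
  day: string; // e.g. "2026-06-11"
  visitorId: string;
}

// Hypothetical aggregate row; it carries no visitor identifier.
interface BrandMetric {
  brandId: string;
  gtin: string;
  day: string;
  scans: number;
  distinctVisitors: number;
}

interface Group {
  brandId: string;
  gtin: string;
  day: string;
  scans: number;
  visitors: Set<string>;
}

// Groups facts by (brand, GTIN, day) and drops every group with fewer than
// `minGroupSize` distinct visitors, so small groups never leave this function.
function aggregateForBrands(facts: ScanFact[], minGroupSize: number): BrandMetric[] {
  const groups = new Map<string, Group>();
  for (const f of facts) {
    const key = `${f.brandId}|${f.gtin}|${f.day}`;
    let g = groups.get(key);
    if (g === undefined) {
      g = { brandId: f.brandId, gtin: f.gtin, day: f.day, scans: 0, visitors: new Set<string>() };
      groups.set(key, g);
    }
    g.scans += 1;
    g.visitors.add(f.visitorId);
  }

  const out: BrandMetric[] = [];
  groups.forEach((g) => {
    if (g.visitors.size >= minGroupSize) {
      out.push({
        brandId: g.brandId,
        gtin: g.gtin,
        day: g.day,
        scans: g.scans,
        distinctVisitors: g.visitors.size,
      });
    }
  });
  return out;
}
```

Suppressed groups are simply absent from the result, which is why small groups cannot appear in dashboards.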

ADR-006: Versioned domain-event contract

Status: accepted · Date: 2026-06-11

Decision: Events use a common envelope with event ID, type, occurrence time, schema version, correlation ID and causation ID. Payloads contain domain reason codes, not localized presentation messages.

Why: Versioning and tracing are required for reliable asynchronous workflows.

Trade-off: Producers and consumers must maintain schema compatibility.
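
One possible shape for the envelope, shown as a sketch. The property names, the `VerdictComputedPayload` fields and the convention that a chain's first event uses its own ID as the correlation ID are assumptions for illustration; the real contract lives in the schema files.

```typescript
// Common envelope with the six fields listed in the decision, plus the payload.
interface EventEnvelope<TPayload> {
  eventId: string;
  type: string;
  occurredAt: string; // ISO 8601 timestamp
  schemaVersion: number;
  correlationId: string;
  causationId: string | null; // null for the first event of a chain
  payload: TPayload;
}

// Hypothetical payload: stable reason codes, never localized messages.
interface VerdictComputedPayload {
  gtin: string;
  reasonCodes: string[];
}

// What a producer supplies; the tracing fields are derived below.
interface NewEvent<TPayload> {
  eventId: string;
  type: string;
  occurredAt: string;
  schemaVersion: number;
  payload: TPayload;
}

// Starts a new chain: the event correlates with itself and has no cause.
function startChain<T>(e: NewEvent<T>): EventEnvelope<T> {
  return { ...e, correlationId: e.eventId, causationId: null };
}

// Continues a chain: keeps the correlation ID and records the direct cause.
function causedBy<T>(cause: EventEnvelope<unknown>, e: NewEvent<T>): EventEnvelope<T> {
  return { ...e, correlationId: cause.correlationId, causationId: cause.eventId };
}
```

With this shape, every event in a workflow shares one correlation ID, and following `causationId` backwards reconstructs the chain.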

ADR-007: Microservices from the start

Status: accepted · supersedes ADR-002 · Date: 2026-06-11

Decision: Deploy all seven services independently from day one: the five context services plus api-gateway and measurement-pipeline. The target topology of the microservices decomposition is built directly, with no intermediate monolith phase.

Why: The course's Part II/III deliverables — per-service containers, Kafka, and Kubernetes — are the project's real goal. Building the target topology directly avoids a throwaway monolith phase and exercises the service boundaries from the first commit.

Trade-off: More operational surface for a solo developer; MVP iteration is slower than inside a single deployable.

ADR-008: Polyglot service stacks by fit

Status: accepted · Date: 2026-06-11

Decision: Each service uses the language that fits its job. Go (chi, pgx + sqlc, franz-go) for api-gateway, passport-service, fridge-service and measurement-pipeline — proxying, external-source resilience, event-sourcing folds and Kafka throughput. TypeScript/Node (Fastify, Kysely, Confluent Kafka client) for identity-service, personalization-service and brand-analytics-service — auth ecosystem, fast-iterating rule policies and dashboard-shaped queries. All services share the same hexagonal layout (domain / application / adapters).

Why: Each language goes where it is strongest, and a polyglot fleet demonstrates real microservice independence rather than asserting it.

Trade-off: Two toolchains to maintain, and cross-cutting plumbing (config, logging, metrics) is implemented twice.

ADR-009: Single Postgres, schema-per-service with per-service roles

Status: accepted · Date: 2026-06-11

Decision: One Postgres instance with six schemas (passport, personalization, fridge, identity, measurement, brand_analytics) and six roles, one per schema-owning service (api-gateway owns no schema), each GRANT-restricted to its own schema. No cross-schema access. The Fridge event store is an append-only events table plus projections in the fridge schema; the measurement schema holds short-retention raw facts and aggregation windows.

Why: A single infrastructure piece keeps local and deployed environments simple, while per-service roles make data ownership enforced by the database rather than by discipline.

Trade-off: A shared instance couples availability: if Postgres is down, every stateful service is down.

ADR-010: Kafka for asynchronous facts, JSON Schema contracts

Status: accepted · Date: 2026-06-11

Decision: Run Apache Kafka (KRaft, single node) from day one, carrying only asynchronous domain facts (ProductScanned, VerdictComputed, ItemAddedToFridge, ItemConsumed/ItemDiscarded, VisitorLinkedToAccount, the consent/erasure events, AlertRaised, BrandMetricBatchPublished). Immediate request/response interactions remain synchronous internal REST per API Design §3. Event contracts are versioned JSON Schemas in a shared contracts/ directory — the ADR-006 envelope plus per-event payloads only, never domain entities or database models — with code generated for both Go and TypeScript. Consumers tolerate unknown optional fields; payloads evolve additively.

Why: Independently deployed services need a real broker for the event catalog anyway, and JSON Schema gives contract governance across two languages without a schema registry or entity coupling.

Trade-off: Contract discipline lives in CI checks rather than a registry, and Part III becomes a Kafka deepening (topic design, partitioning, consumer groups, delivery semantics) rather than a re-engineering.
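
The rule that consumers tolerate unknown optional fields can be sketched as a hand-written tolerant reader. In practice this code would be generated from the JSON Schemas; the `ProductScannedV1` fields below are hypothetical and chosen only for the example.

```typescript
// Hypothetical payload contract: two required fields and one optional field.
interface ProductScannedV1 {
  gtin: string;
  scannedAt: string;
  lot?: string;
}

// Validates the fields this consumer knows about and ignores everything else,
// so a producer can add optional fields without breaking older consumers.
function readProductScanned(json: string): ProductScannedV1 {
  const raw: unknown = JSON.parse(json);
  if (typeof raw !== "object" || raw === null) {
    throw new Error("payload must be a JSON object");
  }
  const obj = raw as Record<string, unknown>;

  const gtin = obj["gtin"];
  if (typeof gtin !== "string") throw new Error("gtin is required");

  const scannedAt = obj["scannedAt"];
  if (typeof scannedAt !== "string") throw new Error("scannedAt is required");

  // Optional field: absent is fine, present with the wrong type is not.
  const rawLot = obj["lot"];
  let lot: string | undefined;
  if (typeof rawLot === "string") {
    lot = rawLot;
  } else if (rawLot !== undefined) {
    throw new Error("lot must be a string when present");
  }

  // Unknown fields are deliberately dropped rather than rejected.
  return lot === undefined ? { gtin, scannedAt } : { gtin, scannedAt, lot };
}
```

Additive evolution then means new fields arrive as optional; removing or retyping a required field is a breaking change that needs a new schema version.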

ADR-011: Delegate authentication to Keycloak

Status: accepted · Date: 2026-06-11

Decision: Authentication is delegated to a self-hosted Keycloak (OIDC). The api-gateway validates Keycloak-issued tokens; identity-service keeps only the domain parts: Visitor identities, visitor→account linking, the consent ledger, and Brand/BrandUser.

Why: Identity is a generic subdomain, and health-adjacent data must not ride on hand-rolled password auth — password reset, refresh rotation, token revocation and session handling come for free from a hardened provider.

Trade-off: A heavyweight JVM container joins the fleet and its configuration must be versioned alongside the code.

ADR-012: Services own and apply their migrations at startup

Status: accepted · Date: 2026-06-13

Decision: Each service carries its schema migrations in its own repository directory and applies them itself at startup, connecting as its own database role. ADR-009 already confines every role to its own schema, and each role's search_path points at that schema, so unqualified DDL and the migration version table land in the right schema automatically. Go services embed plain-SQL migrations and run them with goose as a library; TypeScript services use Kysely's built-in Migrator. Generated data access (sqlc) reads the same SQL files as its schema source.

Why: No extra containers, init steps or cross-language tooling — make up and make smoke keep working unchanged, and a service plus its database schema deploy as one unit, preserving exclusive data ownership.

Trade-off: Concurrent replicas of one service could race on startup migrations (acceptable while each service runs as a single instance; revisit before horizontal scaling), and there is no central migration audit across services.

ADR-013: GS1 anchoring (pragmatic-conformant resolver, strict GTIN, own SDK)

Status: accepted · Date: 2026-06-13

Decision: This refines how PackyTrace implements the resolver promised in ADR-003.

Resolver: pragmatic now, conformant-ready. The Resolver parses the Digital Link and 302-redirects to the product page, but carries an internal linkType model (default gs1:pip, the product-information page). A GS1 Conformant Resolver surface (/.well-known/gs1resolver, application/linkset+json, content-negotiated link resolution) is therefore an additive later step, not a rewrite.
GTIN handling is strict. The canonical catalog key is a mod-10 check-digit-validated, zero-padded GTIN-14. GTIN-8/12/13/14 normalize to GTIN-14 before lookup; an invalid check digit is rejected, never silently mis-resolved. The catalog stores GTIN-14.
Canonical AIs: 01 (GTIN), 10 (lot), 21 (serial), 17 (expiry, YYMMDD, date-validated). Non-standard "friendly" path forms are dropped.
Parsing/validation lives in a standalone GS1 Digital Link SDK owned by the organisation and bound for open source — pure GS1 General Specifications / Syntax Dictionary logic with zero PackyTrace domain concepts (no Brand, catalog, or verdict ever enter it). It ships parity Go and TypeScript implementations over one shared golden test-vector corpus, and is the single source of truth; the existing frontend gs1-decoder.ts is demoted to an offline UX hint and replaced by the SDK's TS package. For the thin slice the SDK lives in-repo as an isolated, independently versioned package; extraction to its own repository and publication (Go module + npm) is post-slice.

Why: GS1 is the platform's namesake standard, so correctness (no wrong-product resolves) and a credible path to interoperability outweigh a certified resolver on day one. A generic SDK keeps the standard logic in one tested place across the Go backend and TS frontend, prevents parser drift, and turns a course requirement into a reusable open-source artefact. Because the SDK is generic GS1 logic and not PackyTrace code, it is an ordinary dependency under ADR-010, not cross-service code sharing — a boundary that holds only while it stays free of PackyTrace domain concepts.

Trade-off: Go/TS parity must be enforced by the shared vector corpus rather than a single binary; full resolver conformance (/.well-known/gs1resolver, linkset) and the open-source extraction are deferred.
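
The strict GTIN rule can be illustrated with a short sketch of the normalization step. The function names are hypothetical and this is not the SDK's API; the check-digit arithmetic is the standard GS1 mod-10 calculation, in which the data digits are weighted 3, 1, 3, 1 and so on, starting with 3 at the digit next to the check digit.

```typescript
// GS1 mod-10 check digit over the 13 data digits of a zero-padded GTIN-14.
function gs1CheckDigit(data13: string): number {
  let sum = 0;
  for (let i = 0; i < data13.length; i++) {
    const digit = data13.charCodeAt(i) - 48; // "0" is char code 48
    // With 13 data digits, even indexes (including the last) get weight 3.
    sum += digit * (i % 2 === 0 ? 3 : 1);
  }
  return (10 - (sum % 10)) % 10;
}

// Normalizes GTIN-8/12/13/14 to a zero-padded GTIN-14 and rejects anything
// with a wrong length, non-digit characters, or an invalid check digit.
function normalizeGtin(input: string): string {
  if (!/^(\d{8}|\d{12}|\d{13}|\d{14})$/.test(input)) {
    throw new Error("GTIN must be 8, 12, 13 or 14 digits");
  }
  const gtin14 = "0".repeat(14 - input.length) + input;
  const expected = gs1CheckDigit(gtin14.slice(0, 13));
  const actual = gtin14.charCodeAt(13) - 48;
  if (expected !== actual) {
    throw new Error("invalid GTIN check digit");
  }
  return gtin14;
}
```

Because zero-padding on the left does not change the weighted sum, validating after padding covers all four input lengths with one code path. Inputs and expected results like these are natural candidates for the shared golden test-vector corpus.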

ADR-014: AWS deployment (single x86 box, compose, Terraform)

Status: accepted · Date: 2026-06-16

Context: The platform must run on a real public URL within AWS free-plan credits (~$100–200, ~6 months). The managed-service shape (ECS Fargate per service + RDS + MSK) costs ~$200+/month — MSK alone exhausts the credits in weeks — so it is not viable on a free budget. The HTTPS requirement is hard: barcode scanning needs camera access, which browsers only grant over TLS.

Decision: Deploy the existing container stack onto one EC2 instance (c7i-flex.large, 4 GB x86_64) running Docker Compose, provisioned with Terraform (deployment/aws/, account 398152419692, eu-central-1). The instance type is constrained by the AWS Free Plan, which only permits launching a fixed allowlist of types (ARM t4g.medium is not on it; c7i-flex.large is the roomiest 4 GB option available).

Self-hosted data plane. Postgres, the broker, and Keycloak run as containers on the box, not as RDS/MSK. The schema-per-service role isolation (ADR-009) is preserved exactly — the prod Postgres init creates the same GRANT-limited roles, with passwords injected from SSM instead of the dev defaults. Keycloak gets its own Postgres database (production mode forbids the dev H2 store) and stays internal.
Redpanda replaces Apache Kafka in the deployed stack only (Kafka-API compatible, JVM-free) so the broker fits the box's RAM. Dev/CI keep Apache Kafka; franz-go is unchanged. No contract or ADR-010 change — same topics, same envelope.
Caddy is the only public surface. It terminates TLS (Let's Encrypt for the configured host), serves the SPA, and proxies /api/*, /01/*, /health on the same origin, so no CORS configuration is needed. Postgres, Redpanda, Keycloak, and all seven services are internal-only. Because auth is server-side ROPC (the OAuth 2.0 Resource Owner Password Credentials grant, ADR-011), Keycloak is never exposed and the token issuer stays its internal address.
Images are built amd64 in CI and pushed to GHCR; the box pulls via a read:packages PAT. Shell access is SSM Session Manager (no SSH, no key pairs). Secrets are generated by Terraform and stored as SSM SecureStrings; state and data persist on a separate EBS volume so the instance is replaceable.

Why: A single box is the only shape that fits the budget while keeping the architecture's load-bearing invariants (DB-enforced tenant/schema isolation, Keycloak auth, contracts-only coupling) intact. Terraform makes the whole footprint reproducible, and destroyable whenever spending has to stop.

Trade-off: No high availability — the box is a single point of failure, and a replacement incurs cold-start + cert re-issue. This is deployment topology only; it does not relax any service boundary (ADR-008/009/010) or the auth/privacy rules (ADR-001/005/011). Migrating to managed services later is a deployment change, not a code change. Backups (EBS snapshots / pg_dump to S3) and a CDN are deferred.