Architecture Decision Records
Only decisions that materially constrain the architecture belong here. Domain rules and terminology live directly in Domain.
ADR-001: Brand-as-tenant multi-tenancy
Status: accepted · Date: 2026-06-11
Decision: Use one shared application and database. Every brand-owned record carries
brand_id; repository/query APIs enforce tenant scope. GTIN ownership is an explicit
Product Catalog mapping. A Brand User belongs to exactly one Brand.
Why: It provides testable brand isolation and cheap onboarding without multiplying infrastructure.
Trade-off: A tenant-scoping defect can leak data, so isolation requires central tests.
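The repository-level scoping described above could look roughly like the following. This is a minimal in-memory sketch, not PackyTrace code: `TenantScopedRepository`, `ProductRecord` and the sample rows are hypothetical names introduced for illustration, and only the idea of a mandatory `brand_id` filter comes from the decision.

```typescript
// Hypothetical brand-owned record; brandId mirrors the brand_id column from the ADR.
interface ProductRecord {
  id: string;
  brandId: string;
  name: string;
}

// The repository is bound to one brand when it is created, so a caller
// cannot run an unscoped query by forgetting a filter.
class TenantScopedRepository {
  private readonly brandId: string;
  private readonly rows: readonly ProductRecord[];

  constructor(brandId: string, rows: readonly ProductRecord[]) {
    this.brandId = brandId;
    this.rows = rows;
  }

  // Every read filters on the bound brand.
  findAll(): ProductRecord[] {
    return this.rows.filter((r) => r.brandId === this.brandId);
  }

  // Looking up another brand's record yields undefined instead of leaking it.
  findById(id: string): ProductRecord | undefined {
    return this.rows.find((r) => r.id === id && r.brandId === this.brandId);
  }
}

// Assumed sample data for two brands.
const sampleRows: ProductRecord[] = [
  { id: "p1", brandId: "brand-a", name: "Oat drink" },
  { id: "p2", brandId: "brand-b", name: "Rye bread" },
];

const repoForBrandA = new TenantScopedRepository("brand-a", sampleRows);
```

Binding the brand at construction keeps the isolation check in one place, which is what makes the central tests mentioned in the trade-off feasible.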
ADR-002: Modular monolith first
Status: superseded by ADR-007 · Date: 2026-06-11
Decision: Deploy the MVP as a modular monolith. Product Passport, Personalization & Verdict, Fridge, Identity & Consent, and Brand Analytics are code and ownership boundaries, not independently deployed services.
Why: The budget, team size and iteration speed do not justify microservices.
Trade-off: The course assignment may evolve modules into services later; boundaries must remain explicit enough to permit that extraction.
ADR-003: Resolve from a local catalog, assemble progressively
Status: accepted · Date: 2026-06-11
Decision: PackyTrace implements the GS1 Digital Link Resolver. A scan resolves a
per-GTIN ProductCatalogEntry, while lot, serial and expiry remain per-item
ScannedItem data. ResolvedPassport progressively combines the catalog entry with
independent external sections. Open Food Facts and Agribalyse are the MVP sources behind
ACL adapters and circuit breakers.
Why: Local catalog data keeps identity and Verdict computation reliable while external failures degrade individual sections instead of the whole scan.
Trade-off: Catalog data must be cached and kept fresh.
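One way to picture the per-section degradation is sketched below, under stated assumptions. `assemblePassport`, `Section` and `SectionLoader` are hypothetical names, the field shapes are invented for illustration, and the circuit breakers and ACL adapters named in the decision are deliberately left out; the sketch only shows that a failing external source marks its own section unavailable while the catalog part still resolves.

```typescript
// Hypothetical shapes; only the names ProductCatalogEntry and ResolvedPassport come from the ADR.
interface ProductCatalogEntry {
  gtin14: string;
  name: string;
}

// A section is either loaded or degraded; a failure never aborts the whole passport.
type Section =
  | { status: "ok"; data: unknown }
  | { status: "unavailable"; reason: string };

interface ResolvedPassport {
  catalog: ProductCatalogEntry;
  sections: Record<string, Section>;
}

// An external source adapter, reduced to an async function for this sketch.
type SectionLoader = () => Promise<unknown>;

async function assemblePassport(
  catalog: ProductCatalogEntry,
  loaders: Record<string, SectionLoader>,
): Promise<ResolvedPassport> {
  // Load all sections concurrently; each loader's error is caught on its own.
  const pairs = await Promise.all(
    Object.entries(loaders).map(async ([name, load]): Promise<[string, Section]> => {
      try {
        return [name, { status: "ok", data: await load() }];
      } catch (err) {
        return [name, { status: "unavailable", reason: String(err) }];
      }
    }),
  );

  const sections: Record<string, Section> = {};
  for (const [name, section] of pairs) {
    sections[name] = section;
  }
  return { catalog, sections };
}
```

Called with one healthy loader and one that throws, the result carries the catalog entry, one `ok` section and one `unavailable` section.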
ADR-004: Server-owned consumer state with pseudonymous visitors
Status: accepted · Date: 2026-06-11
Decision: Create a stable pseudonymous Visitor ID at first scan and link it when an Account is created. Health Profiles, consent records and Fridges are server-owned. Consent revocation, profile deletion and account deletion remain distinct operations; account deletion nullifies Account references on retained anonymous ScanRecords.
Why: This supports measurement, re-screening, expiry alerts and verifiable erasure without exposing health data in scan requests.
Trade-off: Visitor IDs only approximate people: cleared storage and multiple devices split one person across several IDs, while a shared device merges several people into one.
ADR-005: Aggregate before the privacy wall
Status: accepted · Date: 2026-06-11
Decision: Raw scan, Verdict, Fridge-save and account-link facts remain consumer-side.
Only minimum-group-size BrandMetricBatchPublished aggregates cross into Brand Analytics.
Why: Brands need engagement metrics, but no per-scan or per-visitor trail may cross the privacy wall.
Trade-off: Small groups and real-time individual events cannot appear in dashboards.
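The minimum-group-size rule can be sketched as a filter applied at aggregation time. Everything named here is an assumption made for illustration: `ScanFact`, `BrandMetric`, `aggregateForBrands`, the choice of distinct visitors as the group measure, and the threshold value passed by the caller. The ADR fixes only that small groups and per-visitor data must not cross the wall.

```typescript
// Consumer-side raw fact (hypothetical shape); it never leaves the consumer side.
interface ScanFact {
  brandId: string;
  gtin14: string;
  visitorId: string;
}

// What crosses the privacy wall: counts only, no visitor identifiers.
interface BrandMetric {
  brandId: string;
  gtin14: string;
  scans: number;
  distinctVisitors: number;
}

interface Group {
  brandId: string;
  gtin14: string;
  scans: number;
  visitors: Set<string>;
}

function aggregateForBrands(facts: readonly ScanFact[], minGroupSize: number): BrandMetric[] {
  const groups = new Map<string, Group>();
  for (const fact of facts) {
    const key = `${fact.brandId}|${fact.gtin14}`;
    let group = groups.get(key);
    if (group === undefined) {
      group = { brandId: fact.brandId, gtin14: fact.gtin14, scans: 0, visitors: new Set<string>() };
      groups.set(key, group);
    }
    group.scans += 1;
    group.visitors.add(fact.visitorId);
  }

  const metrics: BrandMetric[] = [];
  groups.forEach((group) => {
    // Groups below the threshold are suppressed entirely rather than published.
    if (group.visitors.size >= minGroupSize) {
      metrics.push({
        brandId: group.brandId,
        gtin14: group.gtin14,
        scans: group.scans,
        distinctVisitors: group.visitors.size,
      });
    }
  });
  return metrics;
}
```

Because the output type has no visitor field, a per-visitor trail cannot be published by accident; the threshold only decides which groups appear at all.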
ADR-006: Versioned domain-event contract
Status: accepted · Date: 2026-06-11
Decision: Events use a common envelope with event ID, type, occurrence time, schema version, correlation ID and causation ID. Payloads contain domain reason codes, not localized presentation messages.
Why: Versioning and tracing are required for reliable asynchronous workflows.
Trade-off: Producers and consumers must maintain schema compatibility.
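As an illustration of the envelope, the sketch below types the six listed fields and shows how a follow-up event might inherit tracing IDs. The field spellings (`eventId`, `occurredAt`, and so on), the `causedBy` helper, the payload shapes and the reason code are all hypothetical; the ADR names the concepts, not the wire format.

```typescript
// Common envelope: the six fields listed in the decision plus a typed payload.
interface EventEnvelope<P> {
  eventId: string;
  type: string;
  occurredAt: string; // ISO 8601 timestamp
  schemaVersion: number;
  correlationId: string; // shared by every event in one workflow
  causationId: string | null; // eventId of the direct cause; null for the first event
  payload: P;
}

// Hypothetical payload: reason codes are stable identifiers, not display text.
interface VerdictComputedPayload {
  gtin14: string;
  verdict: string;
  reasonCodes: string[];
}

// Builds a follow-up event that keeps the workflow's correlation ID
// and points its causation ID at the event that triggered it.
function causedBy<P>(
  cause: EventEnvelope<unknown>,
  next: { eventId: string; type: string; occurredAt: string; schemaVersion: number; payload: P },
): EventEnvelope<P> {
  return { ...next, correlationId: cause.correlationId, causationId: cause.eventId };
}

const scanned: EventEnvelope<{ gtin14: string }> = {
  eventId: "evt-1",
  type: "ProductScanned",
  occurredAt: "2026-06-11T10:00:00Z",
  schemaVersion: 1,
  correlationId: "corr-1",
  causationId: null,
  payload: { gtin14: "09506000134352" },
};

const verdictEvent = causedBy<VerdictComputedPayload>(scanned, {
  eventId: "evt-2",
  type: "VerdictComputed",
  occurredAt: "2026-06-11T10:00:01Z",
  schemaVersion: 1,
  payload: { gtin14: "09506000134352", verdict: "avoid", reasonCodes: ["ALLERGEN_MATCH"] },
});
```

Keeping presentation text out of the payload means a client can localize `ALLERGEN_MATCH` itself without a schema change.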
ADR-007: Microservices from the start
Status: accepted · supersedes ADR-002 · Date: 2026-06-11
Decision: Deploy all seven services independently from day one: the five context
services plus api-gateway and measurement-pipeline. The target topology of the
microservices decomposition is built directly, with
no intermediate monolith phase.
Why: The course's Part II/III deliverables — per-service containers, Kafka, and Kubernetes — are the project's real goal. Building the target topology directly avoids a throwaway monolith phase and exercises the service boundaries from the first commit.
Trade-off: More operational surface for a solo developer; MVP iteration is slower than inside a single deployable.
ADR-008: Polyglot service stacks by fit
Status: accepted · Date: 2026-06-11
Decision: Each service uses the language that fits its job. Go (chi, pgx + sqlc,
franz-go) for api-gateway, passport-service, fridge-service and
measurement-pipeline — proxying, external-source resilience, event-sourcing folds and
Kafka throughput. TypeScript/Node (Fastify, Kysely, Confluent Kafka client) for
identity-service, personalization-service and brand-analytics-service — auth
ecosystem, fast-iterating rule policies and dashboard-shaped queries. All services share
the same hexagonal layout (domain / application / adapters).
Why: Each language goes where it is strongest, and a polyglot fleet demonstrates real microservice independence rather than asserting it.
Trade-off: Two toolchains to maintain, and cross-cutting plumbing (config, logging, metrics) is implemented twice.
ADR-009: Single Postgres, schema-per-service with per-service roles
Status: accepted · Date: 2026-06-11
Decision: One Postgres instance with six schemas — passport,
personalization, fridge, identity, measurement, brand_analytics — and six
roles, one per service, each GRANT-restricted to its own schema. No cross-schema
access. The Fridge event store is an append-only events table plus projections in the
fridge schema; the measurement schema holds short-retention raw facts and
aggregation windows.
Why: A single infrastructure piece keeps local and deployed environments simple, while per-service roles make data ownership enforced by the database rather than by discipline.
Trade-off: A shared instance couples availability: if Postgres is down, every stateful service is down.
ADR-010: Kafka for asynchronous facts, JSON Schema contracts
Status: accepted · Date: 2026-06-11
Decision: Run Apache Kafka (KRaft, single node) from day one, carrying only
asynchronous domain facts (ProductScanned, VerdictComputed,
ItemAddedToFridge, ItemConsumed/ItemDiscarded, VisitorLinkedToAccount, the
consent/erasure events, AlertRaised, BrandMetricBatchPublished). Immediate
request/response interactions remain synchronous internal REST per
API Design §3. Event contracts
are versioned JSON Schemas in a shared contracts/ directory — the
ADR-006 envelope plus per-event payloads
only, never domain entities or database models — with code generated for both Go and
TypeScript. Consumers tolerate unknown optional fields; payloads evolve additively.
Why: Independently deployed services need a real broker for the event catalog anyway, and JSON Schema gives contract governance across two languages without a schema registry or entity coupling.
Trade-off: Contract discipline lives in CI checks rather than a registry, and Part III becomes a Kafka deepening (topic design, partitioning, consumer groups, delivery semantics) rather than a re-engineering.
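The "tolerate unknown optional fields" rule is the tolerant-reader pattern. A hand-written sketch follows; the project generates this code from the JSON Schemas, so `decodeItemConsumed`, `isRecord` and the `itemId` field are hypothetical stand-ins used only to show the behaviour: required fields are checked, everything unrecognised is ignored.

```typescript
// Hypothetical payload for ItemConsumed; the real fields live in the contracts/ schemas.
interface ItemConsumedPayload {
  itemId: string;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Tolerant reader: validates what this consumer needs and ignores the rest,
// so producers can add optional fields without breaking older consumers.
function decodeItemConsumed(raw: string): ItemConsumedPayload {
  const parsed: unknown = JSON.parse(raw);
  if (!isRecord(parsed) || parsed["type"] !== "ItemConsumed") {
    throw new Error("not an ItemConsumed event");
  }
  const payload = parsed["payload"];
  if (!isRecord(payload)) {
    throw new Error("missing payload");
  }
  const itemId = payload["itemId"];
  if (typeof itemId !== "string") {
    throw new Error("missing required field payload.itemId");
  }
  // Only known fields are copied; unknown ones are dropped, never rejected.
  return { itemId };
}
```

A message that carries an extra field added by a newer producer still decodes, while one that drops `itemId` is rejected; that asymmetry is what additive evolution relies on.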
ADR-011: Delegate authentication to Keycloak
Status: accepted · Date: 2026-06-11
Decision: Authentication is delegated to a self-hosted Keycloak (OIDC). The
api-gateway validates Keycloak-issued tokens; identity-service keeps only the
domain parts: Visitor identities, visitor→account linking, the consent ledger, and
Brand/BrandUser.
Why: Identity is a generic subdomain, and health-adjacent data must not ride on hand-rolled password auth — password reset, refresh rotation, token revocation and session handling come for free from a hardened provider.
Trade-off: A heavyweight JVM container joins the fleet and its configuration must be versioned alongside the code.
ADR-012: Services own and apply their migrations at startup
Status: accepted · Date: 2026-06-13
Decision: Each service carries its schema migrations in its own repository
directory and applies them itself at startup, connecting as its own database role
(ADR-009 already confines every role to its schema via search_path, so unqualified
DDL and the migration version table land in the right schema automatically). Go
services embed plain-SQL migrations and run them with goose as a library; TypeScript
services use Kysely's built-in Migrator. Generated data access (sqlc) reads the same
SQL files as its schema source.
Why: No extra containers, init steps or cross-language tooling — make up and
make smoke keep working unchanged, and a service plus its database schema deploy as
one unit, preserving exclusive data ownership.
Trade-off: Concurrent replicas of one service could race on startup migrations (acceptable while each service runs as a single instance; revisit before horizontal scaling), and there is no central migration audit across services.
ADR-013: GS1 anchoring — pragmatic-conformant resolver, strict GTIN, own SDK
Status: accepted · Date: 2026-06-13
Decision: This refines how PackyTrace implements the resolver promised in ADR-003.
- Resolver: pragmatic now, conformant-ready. The Resolver parses the Digital Link and 302-redirects to the product page, but carries an internal linkType model (default gs1:pip, the product-information page). A GS1 Conformant Resolver surface (/.well-known/gs1resolver, application/linkset+json, content-negotiated link resolution) is therefore an additive later step, not a rewrite.
- GTIN handling is strict. The canonical catalog key is a mod-10 check-digit-validated, zero-padded GTIN-14. GTIN-8/12/13/14 normalize to GTIN-14 before lookup; an invalid check digit is rejected, never silently mis-resolved. The catalog stores GTIN-14.
- Canonical AIs: 01 (GTIN), 10 (lot), 21 (serial), 17 (expiry, YYMMDD, date-validated). Non-standard "friendly" path forms are dropped.
- Parsing/validation lives in a standalone GS1 Digital Link SDK owned by the organisation and bound for open source: pure GS1 General Specifications / Syntax Dictionary logic with zero PackyTrace domain concepts (no Brand, catalog, or verdict ever enter it). It ships parity Go and TypeScript implementations over one shared golden test-vector corpus, and is the single source of truth; the existing frontend gs1-decoder.ts is demoted to an offline UX hint and replaced by the SDK's TS package. For the thin slice the SDK lives in-repo as an isolated, independently versioned package; extraction to its own repository and publication (Go module + npm) is post-slice.
Why: GS1 is the platform's namesake standard, so correctness (no wrong-product resolves) and a credible path to interoperability outweigh a certified resolver on day one. A generic SDK keeps the standard logic in one tested place across the Go backend and TS frontend, prevents parser drift, and turns a course requirement into a reusable open-source artefact. Because the SDK is generic GS1 logic and not PackyTrace code, it is an ordinary dependency under ADR-010, not cross-service code sharing — a boundary that holds only while it stays free of PackyTrace domain concepts.
Trade-off: Go/TS parity must be enforced by the shared vector corpus rather than a
single binary; full resolver conformance (/.well-known/gs1resolver, linkset) and the
open-source extraction are deferred.
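The strict GTIN rule can be made concrete with the standard GS1 mod-10 check-digit calculation. The sketch below is not the SDK's API: `computeCheckDigit` and `normalizeGtin` are hypothetical names, and Digital Link parsing and AI 17 date validation are out of scope here.

```typescript
// GS1 mod-10: walking right to left over the digits before the check digit,
// weights alternate 3, 1, 3, 1, ... The check digit brings the weighted sum
// up to the next multiple of ten.
function computeCheckDigit(body: string): number {
  let sum = 0;
  for (let i = 0; i < body.length; i++) {
    const digit = body.charCodeAt(body.length - 1 - i) - 48; // "0" has char code 48
    sum += digit * (i % 2 === 0 ? 3 : 1);
  }
  return (10 - (sum % 10)) % 10;
}

// Accepts GTIN-8/12/13/14, rejects a bad check digit, and returns the
// zero-padded GTIN-14 used as the catalog key.
function normalizeGtin(input: string): string {
  if (!/^(\d{8}|\d{12}|\d{13}|\d{14})$/.test(input)) {
    throw new Error("GTIN must be 8, 12, 13 or 14 digits");
  }
  const body = input.slice(0, -1);
  const check = Number(input.slice(-1));
  if (computeCheckDigit(body) !== check) {
    throw new Error("invalid GTIN check digit");
  }
  // Left-padding with zeros does not change the check digit,
  // because the weights are aligned from the right.
  return input.padStart(14, "0");
}
```

For example, `normalizeGtin("4006381333931")` returns `"04006381333931"`, while `normalizeGtin("4006381333932")` throws because the final digit does not match the computed check digit of 1.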
ADR-014: AWS deployment — single ARM box, compose, Terraform
Status: accepted · Date: 2026-06-16
Context: The platform must run on a real public URL within AWS free-plan credits (~$100–200, ~6 months). The managed-service shape (ECS Fargate per service + RDS + MSK) costs ~$200+/month — MSK alone exhausts the credits in weeks — so it is not viable on a free budget. The HTTPS requirement is hard: barcode scanning needs camera access, which browsers only grant over TLS.
Decision: Deploy the existing container stack onto one EC2 instance
(c7i-flex.large, 4 GB x86_64) running Docker Compose, provisioned with
Terraform (deployment/aws/, account 398152419692, eu-central-1). The
instance type is constrained by the AWS Free Plan, which only permits launching a
fixed allowlist of types (ARM t4g.medium is not on it; c7i-flex.large is the
roomiest 4 GB option available).
- Self-hosted data plane. Postgres, the broker, and Keycloak run as containers on the box, not as RDS/MSK. The schema-per-service role isolation (ADR-009) is preserved exactly — the prod Postgres init creates the same GRANT-limited roles, with passwords injected from SSM instead of the dev defaults. Keycloak gets its own Postgres database (production mode forbids the dev H2 store) and stays internal.
- Redpanda replaces Apache Kafka in the deployed stack only (Kafka-API compatible, JVM-free) so the broker fits the box's RAM. Dev/CI keep Apache Kafka; franz-go is unchanged. No contract or ADR-010 change — same topics, same envelope.
- Caddy is the only public surface. It terminates TLS (Let's Encrypt for the configured host), serves the SPA, and proxies /api/*, /01/*, and /health on the same origin, so no CORS is needed. Postgres, Redpanda, Keycloak, and all 7 services are internal-only. Because auth is server-side ROPC (ADR-011), Keycloak is never exposed and the token issuer stays its internal address.
- Images are built amd64 in CI and pushed to GHCR; the box pulls via a read:packages PAT. Shell access is SSM Session Manager (no SSH, no key pairs). Secrets are generated by Terraform and stored as SSM SecureStrings; state and data persist on a separate EBS volume so the instance is replaceable.
Why: A single box is the only shape that fits the budget while keeping the
architecture's load-bearing invariants (DB-enforced tenant/schema isolation, Keycloak
auth, contracts-only coupling) intact. Terraform makes the whole footprint reproducible
and destroy-able to stop spending.
Trade-off: No high availability — the box is a single point of failure, and a
replacement incurs cold-start + cert re-issue. This is deployment topology only; it
does not relax any service boundary (ADR-008/009/010) or the auth/privacy rules
(ADR-001/005/011). Migrating to managed services later is a deployment change, not a
code change. Backups (EBS snapshots / pg_dump to S3) and a CDN are deferred.