Observability
One submission corresponds to one distributed trace across every service it touches. Every state-changing and auth event is captured for audit.
Tracing
- Every service runs the OpenTelemetry SDK and exports to Cloud Trace, Cloud Logging, and Cloud Monitoring.
- Trace context propagates through Pub/Sub messages and Temporal activities via a custom propagator.
- The trace ID is logged on every structured log line, so operators can jump from a log entry straight to the full trace.
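As a minimal sketch of the last point, the snippet below builds one JSON log line that carries the trace ID. It assumes logs are written as JSON and picked up by Cloud Logging, which links an entry to a trace through the `logging.googleapis.com/trace` field. The function name `log_line`, the project ID, and the `submission_id` field are illustrative and not taken from the services themselves.

```python
import json
from datetime import datetime, timezone


def log_line(message, trace_id, project_id, severity="INFO", **fields):
    """Return one JSON log line that carries the trace ID (hypothetical helper)."""
    entry = {
        "severity": severity,
        "message": message,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        # Special field Cloud Logging uses to associate an entry with a trace.
        "logging.googleapis.com/trace": f"projects/{project_id}/traces/{trace_id}",
        **fields,  # extra structured fields, e.g. submission_id
    }
    return json.dumps(entry)


# Example: the trace ID and project ID below are placeholders.
print(log_line("submission accepted",
               trace_id="4bf92f3577b34da6a3ce929d0e0e4736",
               project_id="example-project",
               submission_id="sub-123"))
```

With the trace field populated, a log viewer that understands it can offer a direct link from the entry to the trace.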
flowchart LR
SPA[SPA] -->|trace_id=X| GW[Gateway]
GW -->|trace_id=X| SUB[submission-svc]
SUB -->|trace_id=X<br/>via Pub/Sub| WF[workflow-svc]
WF -->|trace_id=X<br/>via Temporal activity| AI[ai-svc]
AI -->|trace_id=X| SIGN[signing-svc]
SUB --> CT[(Cloud Trace)]
WF --> CT
AI --> CT
SIGN --> CT
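The propagation step shown in the diagram can be sketched as follows. The wire format of the custom propagator is not specified here, so this example assumes the W3C Trace Context `traceparent` format carried as a Pub/Sub message attribute. The helper names `inject`, `extract`, and `new_span_id` are hypothetical, and the sketch uses only the standard library rather than the OpenTelemetry propagator API.

```python
import re
import secrets

# W3C traceparent, version 00: 00-<32 hex trace id>-<16 hex span id>-<2 hex flags>
TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_span_id():
    """Return a random 16-hex-character span ID."""
    return secrets.token_hex(8)


def inject(attributes, trace_id, span_id, sampled=True):
    """Return a copy of the message attributes with a traceparent added."""
    flags = "01" if sampled else "00"
    return {**attributes, "traceparent": f"00-{trace_id}-{span_id}-{flags}"}


def extract(attributes):
    """Return (trace_id, parent_span_id, sampled), or None if absent or malformed."""
    match = TRACEPARENT.match(attributes.get("traceparent", ""))
    if match is None:
        return None
    trace_id, span_id, flags = match.groups()
    # All-zero IDs are invalid in the W3C Trace Context format.
    if set(trace_id) == {"0"} or set(span_id) == {"0"}:
        return None
    return trace_id, span_id, bool(int(flags, 16) & 0x01)


# Publisher side: attach the current trace context to the outgoing message.
attrs = inject({"event_type": "submission.created"},
               trace_id="4bf92f3577b34da6a3ce929d0e0e4736",
               span_id=new_span_id())

# Subscriber side: continue the same trace with a new child span.
context = extract(attrs)
if context is not None:
    trace_id, parent_span_id, sampled = context
    child_span_id = new_span_id()
```

The subscriber keeps the incoming trace ID and records the publisher's span as its parent, which is what keeps one submission on one trace across the asynchronous hop.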
SLOs
| SLO | Target |
|---|---|
| Submission intake availability | 99.9% |
| AI compliance report latency | p95 < 5 min |
| BIM tile first-byte latency | p95 < 2 s |
| SLA reminder delivery success | 99.5% |
Error budgets and burn-rate alerts are wired to on-call paging.
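To make the burn-rate idea concrete, here is a small sketch of the arithmetic. The error budget is the fraction of requests the SLO allows to fail, and the burn rate is the observed error rate divided by that budget. The paging threshold of 14.4 over a long and a short window is a commonly used multi-window value (it corresponds to spending 2% of a 30-day budget in one hour) and is an assumption here, not a documented setting of this system. The function names are illustrative.

```python
def error_budget(slo_target):
    """Fraction of events allowed to fail, e.g. 0.001 for a 99.9% SLO."""
    return 1.0 - slo_target


def burn_rate(bad_events, total_events, slo_target):
    """Observed error rate divided by the error budget.

    A value of 1.0 means the budget is being spent exactly as fast as the
    SLO allows; higher values exhaust it proportionally sooner.
    """
    if total_events == 0:
        return 0.0
    return (bad_events / total_events) / error_budget(slo_target)


def should_page(long_window_rate, short_window_rate, threshold=14.4):
    """Page only if both windows burn fast (threshold is an assumed value)."""
    return long_window_rate >= threshold and short_window_rate >= threshold


# Submission intake, 99.9% target: 20 failures out of 1000 requests in the
# last hour, and 3 out of 100 in the last five minutes (made-up numbers).
long_rate = burn_rate(20, 1000, 0.999)   # roughly 20x the sustainable rate
short_rate = burn_rate(3, 100, 0.999)    # roughly 30x the sustainable rate
page = should_page(long_rate, short_rate)
```

Requiring both windows to exceed the threshold avoids paging on a short spike that has already ended, and avoids paging long after an incident has recovered.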
Audit stream
- Every state-changing domain event and every auth event is published to Pub/Sub.
- A BigQuery sink subscribes to those topics. The dataset is append-only and retained for 10 years.
- Auditors read a purpose-built BigQuery dataset and a Looker Studio workspace. They never touch service databases directly.
Tamper evidence
A daily hash-chain digest of the audit stream is signed by Cloud KMS and published to a public transparency log endpoint. Any retroactive tampering with audit rows would break the chain and be externally detectable.
flowchart LR
EV[Audit events<br/>Pub/Sub] --> BQ[(BigQuery<br/>append-only)]
BQ -->|nightly| HASH[Hash chain digest]
HASH --> KMS[KMS sign]
KMS --> TLOG[Public transparency log]
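The hash-chain step can be illustrated with a short sketch. It assumes SHA-256, a canonical JSON encoding of each audit row, and a fixed all-zero genesis value. None of these choices is stated above, and the function names are hypothetical. Signing with Cloud KMS and publishing to the transparency log are left out.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the very first digest


def row_hash(prev_hash, row):
    """Hash one audit row together with the hash that precedes it."""
    # Canonical encoding so the same row always produces the same bytes.
    canonical = json.dumps(row, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + canonical).encode("utf-8")).hexdigest()


def chain_digest(rows, prev_digest=GENESIS):
    """Fold a day's rows (in a fixed order) into one digest.

    Each day starts from the previous day's digest, so the days form a chain.
    """
    digest = prev_digest
    for row in rows:
        digest = row_hash(digest, row)
    return digest


# Made-up audit rows for two consecutive days.
day1 = [{"event_id": "e1", "type": "submission.created"},
        {"event_id": "e2", "type": "auth.login"}]
day2 = [{"event_id": "e3", "type": "submission.approved"}]

d1 = chain_digest(day1)
d2 = chain_digest(day2, prev_digest=d1)

# Changing any earlier row changes every digest from that day onward.
tampered = [{"event_id": "e1", "type": "submission.deleted"}, day1[1]]
assert chain_digest(tampered) != d1
```

Because each digest depends on the one before it, an auditor who holds a previously published digest can recompute the chain from the stored rows and detect any mismatch.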
PDPA subject requests
Export and erasure are keyed on submission_id / user_id. Erasure tombstones
the record and redacts its personal data; rows are never physically deleted,
so the audit chain stays intact.
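A rough sketch of tombstone-and-redact, under stated assumptions: records are plain dictionaries keyed by `user_id`, the set of personal-data fields is known, and the tombstone is an `erased_at` timestamp. The field names, the `[REDACTED]` placeholder, and the function `erase_subject` are all hypothetical. How redaction is reconciled with the tamper-evidence digest is not described here and is outside this sketch.

```python
from datetime import datetime, timezone

REDACTED = "[REDACTED]"  # assumed placeholder value


def erase_subject(rows, user_id, pii_fields, now=None):
    """Return new rows in which the subject's personal data is redacted.

    No row is removed: matching rows keep their identifiers and gain an
    erased_at tombstone, all other rows are returned unchanged.
    """
    erased_at = (now or datetime.now(timezone.utc)).isoformat()
    result = []
    for row in rows:
        if row.get("user_id") != user_id:
            result.append(row)
            continue
        redacted = {key: (REDACTED if key in pii_fields else value)
                    for key, value in row.items()}
        redacted["erased_at"] = erased_at  # tombstone marker
        result.append(redacted)
    return result


# Made-up records for two users.
rows = [
    {"submission_id": "s1", "user_id": "u1", "email": "a@example.com"},
    {"submission_id": "s2", "user_id": "u2", "email": "b@example.com"},
]
after = erase_subject(rows, "u1", pii_fields={"email"})
```

The row count and the identifiers are unchanged after erasure, which is the property the section relies on.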
Runbook hooks
Every SLO alert links to a runbook (maintained in docs/runbooks/, an M7
deliverable). An alert that does not link to a runbook is treated as a bug
and must be fixed before the next release.
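One way to enforce this rule is a check that runs before release and lists alerts without a runbook link. The sketch below assumes alert definitions are available as dictionaries with `name` and `runbook` keys, which is a hypothetical shape rather than the real alert configuration format. The function name and the alert names are made up.

```python
def alerts_missing_runbook(alerts, prefix="docs/runbooks/"):
    """Return the names of alerts whose runbook link is absent or elsewhere."""
    return [
        alert["name"]
        for alert in alerts
        if not str(alert.get("runbook") or "").startswith(prefix)
    ]


# Made-up alert definitions.
alerts = [
    {"name": "intake-availability-burn", "runbook": "docs/runbooks/intake.md"},
    {"name": "bim-tile-latency-burn"},  # no runbook link: reported
]
missing = alerts_missing_runbook(alerts)
```

A release pipeline could fail the build whenever the returned list is non-empty.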