AI pipeline

Human-in-the-loop, always

No AI output is auto-approved anywhere in SISS. Every AI-produced field — extracted parameters, compliance verdicts, draft narratives, comment suggestions — is reviewed by an officer who can accept, override, or reject per row. The officer's decision is what gets stored and audited.

Compliance-check pipeline

```mermaid
flowchart TD
  subgraph Ingest["1. Ingest"]
    PDF["PDF"] --> DAI[Document AI OCR<br/>+ layout parser]
    DWG["DWG"] --> EZ[ezdxf / DWG→SVG<br/>entity extraction]
    IFC["IFC"] --> IFCO[IfcOpenShell<br/>bim-svc boundary]
    BORANG["Borang forms"] --> TPL[Templated field<br/>extraction]
  end

  Ingest --> NORM[Normalized params]
  NORM --> RAG[Vertex AI Vector Search<br/>retrieve top-k SIRP rules]
  RAG --> GEM[Gemini on Vertex AI]

  GEM --> OUT[Structured JSON<br/>· extracted parameter table<br/>· verdict per rule<br/>· draft narrative<br/>+ provenance: page, bbox, run id]

  OUT --> HITL[Officer review UI<br/>accept / override / reject per row]
  HITL --> EV[Emit ai.compliance.report.ready]
```
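The structured JSON emitted by the model step might look like the following. Field names and the rule id are illustrative assumptions, not the real SISS contract — the shape just mirrors the diagram: parameter table, per-rule verdicts, draft narrative, each with provenance:

```python
import json

# Illustrative shape only; actual field names in SISS may differ.
report = {
    "parameters": [
        {"name": "plot_ratio", "value": 2.8,
         "provenance": {"page": 3, "bbox": [102, 440, 310, 468], "run_id": "run-0042"}},
    ],
    "verdicts": [
        {"rule_id": "SIRP-PR-001", "verdict": "pass",
         "provenance": {"page": 3, "bbox": [102, 440, 310, 468], "run_id": "run-0042"}},
    ],
    "draft_narrative": "The proposed plot ratio of 2.8 complies with the zone limit.",
}

print(json.dumps(report, indent=2))
```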

Provenance on every field

Every AI-produced field stores:

  • model_id — which Vertex model produced it.
  • prompt_version — which prompt template was used.
  • source_ref — document, page, and bounding box for traceability.
  • run_id — the pipeline run that produced it.

This provenance is what makes overrides defensible — an officer can always trace why the AI said what it said.
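As a sketch, the four provenance fields above could be carried on every field as an immutable record (names from the doc; the `trace` helper and `source_ref` string format are assumptions for illustration):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provenance:
    """Provenance attached to every AI-produced field."""
    model_id: str        # which Vertex model produced the field
    prompt_version: str  # which prompt template was used
    source_ref: str      # e.g. "doc:page:x0,y0,x1,y1" — document, page, bbox
    run_id: str          # the pipeline run that produced the field

    def trace(self) -> str:
        # Human-readable line for the review UI / audit log (hypothetical helper).
        return (f"{self.model_id} @ {self.prompt_version} "
                f"from {self.source_ref} (run {self.run_id})")
```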

Eval gates

Every prompt has a golden set. Regression evals run on every prompt or model change, and per-rule-family accuracy thresholds gate deployment:

  Metric                    Threshold
  Parameter extraction F1   ≥ 0.85
  Verdict accuracy          ≥ 0.90
  Malformed-output rate     ≤ 1%
  Height extraction (M5)    ≥ 90% within tolerance

Failures block the pipeline's CI; no "I'll fix it later" merges.
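The CI gate reduces to a comparison of eval metrics against the table above. A minimal sketch — metric names are invented here; the thresholds are the documented ones:

```python
# Thresholds from the eval-gate table; metric keys are illustrative.
GATES = {
    "param_extraction_f1":   (">=", 0.85),
    "verdict_accuracy":      (">=", 0.90),
    "malformed_output_rate": ("<=", 0.01),
    "m5_height_within_tol":  (">=", 0.90),
}


def gate(metrics: dict[str, float]) -> list[str]:
    """Return the list of failed gates; an empty list means deploy may proceed."""
    failures = []
    for name, (op, threshold) in GATES.items():
        value = metrics[name]
        ok = value >= threshold if op == ">=" else value <= threshold
        if not ok:
            failures.append(f"{name}={value} violates {op} {threshold}")
    return failures
```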

SIRP rule corpus

  • Storage. YAML + markdown in rules/, each with front-matter (rule_id, category, version, applies_to, thresholds).
  • Indexing. Loaded → chunked → embedded (Vertex text-embedding) → upserted to Vertex Matching Engine.
  • Retrieval. Top-k rules per plot context, joined into the prompt.
  • Change management. Rule updates are a first-class admin feature: versioned records, automatic re-embedding, audit entry per rule update.
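The retrieval step can be illustrated without the Vertex stack. The sketch below substitutes bag-of-words vectors and cosine similarity for the real embedding model and Matching Engine — a toy stand-in purely to show the top-k selection over a rule corpus:

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    # Toy stand-in for the Vertex text-embedding model.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def top_k_rules(query: str, rules: dict[str, str], k: int = 2) -> list[str]:
    """Return rule_ids of the k rules most similar to the plot context."""
    q = embed(query)
    ranked = sorted(rules, key=lambda rid: cosine(q, embed(rules[rid])), reverse=True)
    return ranked[:k]
```

In production the same top-k result set is what gets joined into the Gemini prompt.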

Where the same pipeline is reused

  • M2 — Compliance checking + Kertas Perakuan / Surat Sokongan draft generation.
  • M3 — Comment drafting uses the same pipeline with a different corpus (comment library + prior approved comments per ATD / ATL).
  • M5 — Metadata extraction → zoning comparison in the BIM pipeline consumes AI-normalized parameters.

Schema-validated outputs with repair retry

LLM output is validated against a JSON Schema. Invalid output triggers a single "repair" prompt; a second failure flags the submission for manual review rather than being re-tried to exhaustion.
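The validate → repair-once → manual-review flow can be sketched as below. `check_schema` is a stand-in for a real JSON Schema validator and `call_model` for the Vertex call; both names are assumptions:

```python
import json

# Minimal stand-in for JSON Schema validation: required keys and their types.
REQUIRED = {"parameters": list, "verdicts": list, "draft_narrative": str}


def check_schema(payload: dict) -> bool:
    return all(isinstance(payload.get(k), t) for k, t in REQUIRED.items())


def validated_output(call_model, repair_prompt: str):
    """Return (payload, needs_manual_review)."""
    raw = call_model(None)  # first attempt, no repair hint
    try:
        payload = json.loads(raw)
        if check_schema(payload):
            return payload, False
    except json.JSONDecodeError:
        pass
    # Exactly one repair attempt — never retried to exhaustion.
    raw = call_model(repair_prompt)
    try:
        payload = json.loads(raw)
        if check_schema(payload):
            return payload, False
    except json.JSONDecodeError:
        pass
    return None, True  # flag the submission for manual review
```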