Reference architecture

An AI-ready data platform.

Modern data work asks for more than a warehouse and a dashboard. It needs deployable infrastructure, end-to-end observability, governed access, and first-class footing for AI agents. This is what that looks like — built cleanly, end to end.

Capabilities

Automated deployments

GitOps as the single source of truth. Every change reviewed, applied, and reconciled automatically. No drift, no surprises, no missing runbook.

Observability

Logs, metrics, and traces unified in one workspace. Service-level visibility from the ingestion edge down to a single query plan, queryable in seconds.

Telemetry

OpenTelemetry-native instrumentation across pipelines and services. Standard formats, standard collectors, no vendor lock-in.

Identity & governance

Single sign-on, fine-grained roles, and full audit trails. Every action ties back to a person, every dataset to an owner.

Data warehouse

The trusted, modeled core. Transformed, tested, and versioned with the same rigor as application code.

Data lake

Raw and curated data on open formats and object storage. Decoupled from any single compute engine, reusable across them all.

Visualisation

Dashboards and exploration that share the same definitions as the warehouse. One number, one meaning, everywhere it appears.

Semantic layer

Metrics defined once, consumed everywhere — by humans, by BI tools, by agents. The contract between data and the people who use it.

MCP & agents

Tooling exposed via Model Context Protocol so AI agents can query, transform, and act on data the same way an analyst would. Safely, with the same guardrails.

Workflow orchestration

Reliable, observable scheduling for ingestion, transformation, and ML jobs. Failures notice themselves; reruns don't surprise anyone.

Self-service

Sane defaults so analysts and engineers can ship without filing tickets. Sharp guardrails so the platform doesn't get worse over time.

Cost & capacity

Per-team, per-pipeline visibility into spend and resource use. Decisions about scale are decisions, not guesses.

Approach

A platform isn't a list of products — it's the way the products fit together, the contracts they expose, and the discipline that keeps it consistent as it grows. The capabilities above are the surface; underneath, a few principles do the work.

  • Open formats, open standards. Replaceable parts, not a stack you can't leave.
  • Declarative everything. If it's not in git, it isn't.
  • One source of truth per concept. Identity, metrics, lineage — defined once, referenced everywhere.
  • AI as a first-class consumer. Agents query the same semantic layer humans do, through the same contracts.
  • Boring where boring is right. Mature components, minimal moving parts, room to evolve.