Senior Data Architect

Job Description

Data Architect
GCP × Databricks | Platform & Data Architecture | Permanent / Senior IC
Function: Data & Platform Engineering
Level: Principal / Staff Architect
Employment: Permanent, Full-Time
Location: Remote-Friendly / Hybrid
Experience: 8+ Years (Architecture Focus)
GCP Expertise: Professional / Expert Level
Databricks: Certified Preferred
Reports To: VP / Head of Data Platform

Position Summary
We are looking for a Principal Data Architect with mastery across both Google Cloud Platform (GCP) and Databricks to lead the design and evolution of our enterprise data platform. This is a senior individual contributor role with broad influence: you will set the architectural direction for how data is ingested, stored, transformed, governed, and consumed across the organisation.

The ideal candidate brings deep, hands-on expertise in both ecosystems, not surface-level familiarity, and has a proven track record of designing production-grade, scalable data platforms that serve analytics, machine learning, and operational workloads. You will work at the intersection of strategy and engineering, translating business requirements into robust technical blueprints while mentoring engineering teams on their implementation.

Key Responsibilities
Platform Architecture & Design

Define the end-to-end architecture of the enterprise data platform spanning GCP (BigQuery, Dataproc, Cloud Composer, Pub/Sub) and Databricks (Unity Catalog, Delta Live Tables, MLflow)
Design and govern the lakehouse architecture, including bronze/silver/gold medallion layers, Delta Lake table design, and data lifecycle policies
Architect data ingestion patterns for batch, micro-batch, and real-time streaming workloads across both platforms
Evaluate and select tooling, frameworks, and services — balancing cost, performance, operational overhead, and strategic fit
Produce authoritative architecture artefacts: HLDs, LLDs, data flow diagrams, decision logs (ADRs), and reference architectures
Data Modelling & Governance
Design logical and physical data models (dimensional, normalised, and domain-oriented) appropriate to use case and access pattern
Establish and enforce data governance standards: cataloguing (Dataplex, Unity Catalog), lineage tracking, access control, and data classification
Define and implement data contracts between producing and consuming teams
Lead the adoption of data mesh or domain-oriented data ownership principles where appropriate
Engineering Enablement & Standards
Set engineering standards for PySpark / SQL development, pipeline design patterns, and testing practices across the data engineering function
Define CI/CD practices for data pipelines, including environment promotion, schema change management, and automated testing gates
Champion infrastructure-as-code (Terraform) for reproducible GCP and Databricks environment provisioning
Establish observability standards — data quality monitoring, pipeline SLAs, alerting, and incident response runbooks
Stakeholder & Cross-Functional Leadership
Partner with data engineering, data science, analytics, and product teams to translate requirements into actionable architectural decisions
Engage with GCP and Databricks account and technical teams to leverage roadmap features and managed support
Provide technical oversight and architectural review on major initiatives, ensuring alignment to the target state platform
Mentor senior engineers, conduct design reviews, and elevate architectural thinking across the data organisation

Required Skills & Experience
Google Cloud Platform — Expert Level

Deep hands-on experience designing production workloads on GCP data services: BigQuery (partitioning, clustering, BI Engine, materialized views), Dataproc, Cloud Composer (Airflow), Pub/Sub, Dataflow, and GCS
Expert understanding of GCP IAM, VPC Service Controls, and security architecture for data platforms
Experience designing multi-region, HA data architectures on GCP with DR considerations
Proficiency with GCP cost optimisation strategies — slot reservations, storage tiers, autoscaling, and committed use discounts
Familiarity with Vertex AI and its integration with the GCP data ecosystem for ML workloads
Databricks — Expert Level
Mastery of the Databricks Lakehouse Platform: Unity Catalog, Delta Lake internals, Delta Live Tables (DLT), and Photon engine
Deep experience designing Databricks workspace architecture — cluster policies, job compute, SQL warehouses, and access tiers
Expertise in Delta Lake optimisation: Z-ordering, OPTIMIZE, VACUUM, liquid clustering, and change data feed
Hands-on experience with Databricks MLflow for experiment tracking and model registry in production
Proficiency with Databricks Asset Bundles (DABs) or Terraform provider for IaC-based workspace management
Core Data Architecture
8+ years in data engineering or data architecture roles, with at least 3 years in a dedicated architecture capacity
Strong data modelling skills — dimensional modelling (Kimball), Data Vault 2.0, and domain-driven design applied to data
Expert-level SQL and PySpark — ability to review, advise on, and benchmark complex transformations
Proven experience designing real-time and streaming architectures using Kafka, Pub/Sub, or Kinesis alongside Spark Structured Streaming
Deep understanding of data quality frameworks: Great Expectations, dbt tests, Soda, or equivalent
Experience with metadata management and data cataloguing tools: Google Dataplex, Unity Catalog, Apache Atlas, or Collibra
Certifications (Preferred)
Google Cloud Professional Data Engineer or Professional Cloud Architect
Databricks Certified Data Engineer Professional or Databricks Certified Associate Developer for Apache Spark
Additional: dbt Certified Developer, AWS Solutions Architect (for polycloud exposure)

Technical Environment
Google Cloud Platform Stack

BigQuery
Dataproc
Cloud Composer
Pub/Sub
Dataflow
GCS
Dataplex
Vertex AI
Databricks Stack
Unity Catalog
Delta Lake
Delta Live Tables
MLflow
Photon Engine
SQL Warehouses
Databricks Asset Bundles
Workflows
Cross-Platform Toolchain
Apache Spark 3.x
PySpark / SQL
dbt Core / Cloud
Great Expectations
Terraform
Apache Kafka
Apache Airflow
GitHub Actions

Leadership & Behavioural Competencies
Beyond technical mastery, the successful candidate will demonstrate the following:
Architectural Thinking
Approaches problems from first principles; balances pragmatism with long-term platform health
Communication
Translates complex technical concepts clearly for engineering peers and non-technical executives alike
Decisiveness
Makes well-reasoned architectural decisions under ambiguity and documents them transparently via ADRs
Mentorship
Actively grows the architectural capability of the broader data engineering team through pairing and review
Vendor Acumen
Navigates GCP and Databricks roadmaps, partnerships, and commercial levers to the organisation’s advantage
Bias for Quality
Champions data quality, observability, and operational excellence as non-negotiable engineering standards

Thank you for your interest in this role. Please share your CV with Vedika@lsarecruit.co.uk; if your profile is suitable, we will get in touch to discuss further.
