We’re hiring a senior, hands-on Data Platform Engineer to design and build a modern data platform from the ground up. This is a build-first role focused on engineering quality, scalability, and real delivery. You’ll own the creation of a new open-source-led data lake / lakehouse, integrating multiple data sources and enabling analytics, operational use cases, and future AI/ML workloads.
Role
- Design and build a greenfield data lake / lakehouse platform
- Engineer high-throughput batch and streaming pipelines
- Implement scalable processing using open-source technologies, including:
  - Apache Spark (batch and Structured Streaming)
  - Apache Flink (real-time and streaming pipelines)
  - Trino or equivalent distributed SQL engines
- Implement and operate modern table formats such as Apache Iceberg, Delta Lake, or Apache Hudi
- Build ingestion, transformation, and consolidation frameworks across multiple data sources
- Own delivery end-to-end, from design through production, optimisation, and support
- Ensure data is reliable, scalable, and usable for analytics, reporting, and AI/ML use cases
Requirements
- 10+ years of commercial experience
- Proven experience building and operating data platforms in production
- End-to-end data lake design and build experience
- Hands-on experience with:
  - Apache Spark
  - Apache Flink or equivalent streaming engines
  - Apache Iceberg, Delta Lake, or Apache Hudi
- Experience working with object-storage-backed data lakes (S3, ADLS, GCS, MinIO)
- Strong coding skills in Python, Scala, or Java
- Experience integrating multiple data sources into a unified platform
- Comfortable owning technical decisions and working in delivery-focused environments
- Pragmatic mindset with a bias towards building, optimising, and shipping