We’re hiring a senior, hands-on Data Platform Engineer to design and build a modern data platform from the ground up. This is a build-first role focused on engineering quality, scalability, and real delivery. You’ll own the creation of a new open-source-led data lake / lakehouse, integrating multiple data sources and enabling analytics, operational use cases, and future AI/ML workloads.
Role
- Design and build a greenfield data lake / lakehouse platform
- Engineer high-throughput batch and streaming pipelines
- Implement scalable processing using open-source technologies, including:
  - Apache Spark (batch and Structured Streaming)
  - Apache Flink (real-time and streaming pipelines)
  - Trino or equivalent distributed SQL engines
- Implement and operate modern table formats such as Apache Iceberg, Delta Lake, or Apache Hudi
- Build ingestion, transformation, and consolidation frameworks across multiple data sources
- Own delivery end-to-end, from design through production, optimisation, and support
- Ensure data is reliable, scalable, and usable for analytics, reporting, and AI/ML use cases
Requirements
- 10+ years of commercial experience
- Proven experience building and operating data platforms in production
- End-to-end data lake design and build experience
- Hands-on experience with:
  - Apache Spark
  - Apache Flink or equivalent streaming engines
  - Apache Iceberg, Delta Lake, or Apache Hudi
- Experience working with object-storage-backed data lakes (S3, ADLS, GCS, MinIO)
- Strong coding skills in Python, Scala, or Java
- Experience integrating multiple data sources into a unified platform
- Comfortable owning technical decisions and working in delivery-focused environments
- Pragmatic mindset with a bias towards building, optimising, and shipping