Position Responsibilities
· Work in a dedicated data reporting and analytics team building a world-class data platform to produce data-driven insights for Transact Campus and our clients
· Analyze, interpret and orchestrate complex data across disparate sources comprising unstructured, semi-structured and structured datasets in streaming and batch modes
· Design and develop real-time data pipelines using Databricks and Delta Lake on Azure (see the sketch after this list)
· Collaborate with data consumers (reporting, analysis, or data science) to provide metrics that meet their needs
· Contribute to standards for data producers streaming data into the Lakehouse
· Test commercial software products using both manual and automated testing processes
· Support the application lifecycle during QA, UAT, and post-release phases
· Comply with and contribute to consistent development guidelines (coding, change control, build, versioning)
· Participate in peer code reviews
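For context on the day-to-day work, a minimal sketch of the kind of streaming pipeline described above, assuming a hypothetical Kafka topic named events and made-up lake paths; none of these names refer to actual Transact Campus systems:

```python
# Minimal sketch (hypothetical names/paths): read a JSON event stream,
# apply a light transformation, and write continuously to a Delta table.
from pyspark.sql import SparkSession
from pyspark.sql.functions import from_json, col
from pyspark.sql.types import StructType, StructField, StringType, TimestampType

spark = SparkSession.builder.appName("events-to-delta").getOrCreate()

event_schema = StructType([
    StructField("event_id", StringType()),
    StructField("campus_id", StringType()),
    StructField("event_time", TimestampType()),
    StructField("payload", StringType()),
])

raw = (spark.readStream
       .format("kafka")  # placeholder source; could equally be Event Hubs
       .option("kafka.bootstrap.servers", "broker:9092")
       .option("subscribe", "events")
       .load())

parsed = (raw
          .select(from_json(col("value").cast("string"), event_schema).alias("e"))
          .select("e.*")
          .filter(col("event_id").isNotNull()))

(parsed.writeStream
 .format("delta")
 .option("checkpointLocation", "/mnt/lake/_checkpoints/events")
 .outputMode("append")
 .start("/mnt/lake/silver/events"))
```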
Required Skills
· Bachelor’s degree in Computer Science, IT, or a related field, or equivalent work experience, preferably with a focus on data analytics
· 5+ years of enterprise-level data engineering experience
· Expertise in big data workloads, both on-premises and in the cloud
· Experience with data lakes (HDFS, Azure Data Lake, or AWS S3) and scale-out processing
· Knowledge of relational database design and best practices
· Hands-on experience designing and developing Spark data pipelines
· Strong SQL and Python skills
· Experience with ETL/ELT patterns, preferably using Databricks Jobs (illustrated in the sketch after this list)
· Excellent technical documentation skills
· Experience with source code management systems such as Git/TFS/SVN
· Experience working in Agile teams (Scrum, XP, Kanban)
· Ability to present ideas and insights to business stakeholders
· Fluency in written and spoken English
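As a rough illustration of the Spark, SQL/Python, and ETL/ELT skills above, a sketch of a batch ELT step over Delta tables; the table paths, columns (e.g. amount), and aggregation are invented for this example:

```python
# Illustrative batch ELT step (hypothetical table names): load raw records,
# aggregate with Spark SQL, then upsert into a curated Delta table.
from pyspark.sql import SparkSession
from delta.tables import DeltaTable

spark = SparkSession.builder.appName("daily-elt").getOrCreate()

spark.read.format("delta").load("/mnt/lake/bronze/transactions") \
     .createOrReplaceTempView("bronze_txn")

daily = spark.sql("""
    SELECT campus_id,
           CAST(event_time AS DATE) AS txn_date,
           COUNT(*)                 AS txn_count,
           SUM(amount)              AS total_amount
    FROM bronze_txn
    GROUP BY campus_id, CAST(event_time AS DATE)
""")

# Assumes the curated (gold) table already exists at this path.
target = DeltaTable.forPath(spark, "/mnt/lake/gold/daily_transactions")
(target.alias("t")
 .merge(daily.alias("s"),
        "t.campus_id = s.campus_id AND t.txn_date = s.txn_date")
 .whenMatchedUpdateAll()
 .whenNotMatchedInsertAll()
 .execute())
```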
Preferred Skills
· Good understanding of Azure data services (Azure Databricks, Azure Data Factory, Azure Data Lake Storage Gen2)
· Experience with Databricks Delta Lake, Delta Sharing, and Delta Live Tables (see the sketch after this list)
· Experience with Spark Structured Streaming
· Experience with NoSQL databases
· Experience with Infrastructure as Code technologies such as Terraform or ARM
· Experience in Data Science and ML methodologies
· Experience with Azure streaming services (Event Hubs, Event Grid)
· Understanding of data strategy, including data governance and data management
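Finally, a small sketch of the Delta Live Tables pattern mentioned above; the dataset names and landing path are hypothetical, and the code runs only inside a DLT pipeline (where the spark session is predefined):

```python
# Sketch of a Delta Live Tables pipeline (hypothetical dataset names):
# declare a streaming bronze table and a cleaned silver table with an
# expectation that drops malformed rows.
import dlt
from pyspark.sql.functions import col

@dlt.table(comment="Raw events ingested from cloud storage")
def bronze_events():
    return (spark.readStream
            .format("cloudFiles")          # Databricks Auto Loader
            .option("cloudFiles.format", "json")
            .load("/mnt/landing/events/"))

@dlt.table(comment="Validated events")
@dlt.expect_or_drop("valid_event_id", "event_id IS NOT NULL")
def silver_events():
    return dlt.read_stream("bronze_events").select(
        col("event_id"), col("campus_id"), col("event_time"))
```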
#LI-VH2