Summary of role:
We are seeking a Data Engineer with extensive ETL experience to join our team. The ideal candidate will have a strong background in data integration, data warehousing, and data transformation processes. You will be responsible for designing, developing, and implementing complex ETL solutions to support our cloud-based initiatives.
Primary duties may include, but are not limited to:
- Design, develop, and maintain high-performance, scalable, and reliable ETL processes to support data integration, data migration, and data transformation needs in data warehouse environments.
- Collaborate with business analysts, data scientists, and architects to understand business requirements and translate them into ETL specifications for data warehouse solutions.
- Analyse source systems, data models, and data structures to ensure data quality and consistency across data storage systems.
- Perform data profiling, data cleansing, and data validation to ensure accurate data transformation and integration.
- Develop and maintain ETL process documentation, including data flow diagrams, data mapping documents, and test cases for data warehouse implementations.
- Optimise ETL performance by employing best practices, monitoring performance metrics, and troubleshooting issues.
- Participate in code reviews, peer feedback sessions, and continuous improvement initiatives for ETL processes in data warehouse architectures.
- Work closely with data warehouse and database administrators to ensure optimal data storage and retrieval processes.
- Provide technical support and guidance to other team members on ETL processes and data integration.
- Stay current with industry trends, emerging technologies, and best practices in ETL development, data integration, and cloud-based data warehousing.
Requirements:
- Bachelor's degree in Computer Science, Information Systems, or a related field.
- 5+ years of experience in ETL development, data integration, or data warehousing.
- Experience working with cloud-based data warehouse platforms such as Amazon Redshift or Snowflake.
- Familiarity with AWS services, including S3, Glue, Lambda, and EMR, for data storage, processing, and integration.
- Experience working with relational databases (e.g., SQL Server, Oracle, MySQL) and NoSQL databases (e.g., MongoDB, Cassandra) in both on-premises and cloud environments.
- Proficiency in SQL, including data manipulation language (DML) statements, for data querying and transformation.
- Familiarity with big data technologies, such as Hadoop, Spark, and Kafka, is a plus.
- Strong analytical and problem-solving skills, with the ability to understand complex data structures and relationships in both cloud-based and traditional data warehouse environments.
- Proficiency in programming with Python or Scala.
- Familiarity with version control and CI/CD tools and practices.
- Excellent communication and collaboration skills, with the ability to work effectively with cross-functional teams.
- Ability to manage multiple tasks and priorities in a fast-paced environment.
- Familiarity with data privacy and security best practices, especially as they pertain to cloud-based data storage and processing.