Lead Data Engineer, Dubai. Python, PySpark, Big Data, AWS, S3, EMR, Redshift, Glue, etc. REF 235
Job Description – Lead Data Engineer
Our multinational client is seeking a highly skilled and motivated Lead Data Engineer to join its dynamic team. As the Lead Data Engineer, you will play a crucial role in designing, implementing, and maintaining the client's data infrastructure and pipelines. You will lead a team of data engineers, collaborate with data scientists and architects, and work closely with other cross-functional teams to deliver high-quality, reliable, and scalable data solutions.
Requirements
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
- Proven experience (8+ years) as a Data Engineer, with demonstrated expertise in leading data engineering teams.
- Proficiency in Python and PySpark, with a strong understanding of distributed computing and big data processing.
- Extensive experience with AWS data services, including but not limited to S3, EMR, Redshift, Glue, Lake Formation, Lambda, SNS, and CloudWatch.
- Strong knowledge of database systems, data modelling, and data warehousing principles.
- Experience with data pipeline orchestration tools such as Apache Airflow and AWS Step Functions.
- Familiarity with data governance, security, and compliance practices.
- Excellent problem-solving and analytical skills, with keen attention to detail.
- Strong communication skills and the ability to convey technical concepts clearly to non-technical stakeholders.
- Leadership and mentoring abilities, with a track record of successfully leading and managing a team of data engineers.
- Ability to work in a fast-paced, dynamic environment and adapt to changing priorities.
Responsibilities
- Architect and Design Data Solutions: Lead the design and architecture of scalable, efficient, and robust data pipelines and systems to handle large volumes of data from various sources.
- Team Leadership: Manage and mentor a team of data engineers, ensuring successful project execution, fostering a collaborative environment, and providing technical guidance.
- Data Integration: Oversee the integration of data from multiple sources, both internal and external, to ensure data consistency and accuracy.
- Data Transformation: Develop and implement data transformation processes using Python and PySpark, ensuring data quality and proper data governance.
- Data Warehouse Management: Design and maintain data warehouses on AWS, ensuring data availability, reliability, and security.
- Performance Optimization: Identify performance bottlenecks in data processing and implement optimizations to improve overall data pipeline efficiency.
- Data Governance and Security: Implement and enforce data governance policies and best practices to ensure data security, privacy, and compliance with relevant regulations.
- Continuous Improvement: Stay up to date with the latest technologies, tools, and best practices in the data engineering space. Propose and implement process improvements to enhance productivity and data quality.
- Collaboration: Collaborate with cross-functional teams, including data scientists, software engineers, and business stakeholders, to understand data requirements and deliver data solutions that meet business needs.