ETL Developer, Dubai REF 180
Our client is a reputable global company. This role requires proficient engineering skills and both an application and a systems mindset.
Requirements
• Computer Science degree. (Alternatively, 5+ years of relevant experience with a history of skills progression and demonstrated accomplishments.)
• 3+ years of experience specifically in data engineer or ETL developer roles
• 5+ years in SQL: Programming complex queries, dynamic SQL, stored procedures, user-defined functions, performance tuning
• 3+ years in ETL: Data pipelines, ETL concepts and frameworks (DST), data-oriented cloud architecture, data warehousing, scalable technologies (such as column-store databases and Spark)
• 2+ years in data-oriented cloud services: Preferably AWS (S3, RDS, Redshift, Athena, Glue), Databricks, or related. Azure or GCP are also acceptable.
• 3+ years with Python, Pandas, and Spark. (SAS and R beneficial.)
• 1+ years with the Linux command line, shell scripting, and Git/GitHub.
Job Description
You will be able to work both within a team and as an autonomous engineer.
You will be a critical thinker with the ability to troubleshoot and escalate when faced with time-consuming challenges.
You will play a key role in delivering powerful data-driven products that enable sustainable models.
- Build and maintain processes and policies supporting data transformation, data structures, metadata, dependencies, and the data dictionary across the pipeline.
- Create and maintain optimal data models and data pipelines for our data and support teams.
- Apply detailed knowledge of data warehouse technical architectures, infrastructure components, ETL/ELT, and reporting/analytics tools.
- Use cloud-based data warehouses such as Redshift, BigQuery, and Synapse.
- Program in SQL and Python, and use cloud service APIs, to automate the processing of customer data.
- Serve as the company's data wrangling and ETL expert. Ingest, transform, cleanse, and augment internal and external data assets.
- Build algorithms and data rules to support de-duplication and rule-based de-identification.
- Leverage a range of cloud services, ETL frameworks, and libraries, such as AWS (EC2, RDS, S3, Lambda, Redshift, Athena, Glue), Databricks, Postgres, Spark SQL, Python, Pandas, PySpark, Apache Airflow, and DST.
- Continuously learn by investigating new technologies and mentoring other team members.
- Collaborate with data architects and business SMEs to design and develop end-to-end data pipelines that meet the needs of a fast-paced organization across geographic regions.
- Lead and support the resolution of data challenges and the reconciliation of big data.
- Expand customer data points to support additional insights.
- Design and develop modular and reusable data pipelines and ETL processes to connect with various data sources.
- Support application programming interfaces (APIs), tools, and third-party products to extract data from SaaS applications.