Data Pipeline Development: Design and implement ETL/ELT pipelines using modern data stack tools (e.g., dbt, Airflow, Airbyte) and cloud data warehouses such as Redshift.
Data Infrastructure: Build and maintain a scalable, efficient data platform on AWS, leveraging services including but not limited to S3, Glue, Athena, Lambda, and Redshift.
Data Quality and Governance: Ensure data quality, integrity, and compliance with governance standards through monitoring and validation techniques.
Collaboration: Work closely with data scientists, analysts, and stakeholders to understand data requirements and deliver appropriate solutions.
Performance Optimization: Optimize database and query performance for large-scale data environments.
Automation and CI/CD: Implement automated workflows for deployment and monitoring of data pipelines using CI/CD practices.
Mentorship: Provide technical guidance to junior data engineers and share data engineering best practices.
Education: Bachelor's degree or higher in Computer Science, Information Systems, Engineering, or a related field.
Experience: 5+ years of experience in data engineering or a similar role.
Technical Skills:
Proficiency in SQL and at least one programming language (e.g., Python or Java).
Hands-on experience with modern data stack tools such as dbt, Airflow, or similar.
Expertise with AWS services, including S3, Redshift, Glue, Lambda, and Athena.
Knowledge of data modeling techniques and schema design for data warehouses.
Experience with distributed systems and big data technologies (e.g., Spark, Kafka).
Cloud Experience: Proven track record of building and managing data infrastructure on AWS, GCP, or Azure.
Soft Skills: Strong communication and collaboration skills, with the ability to explain technical concepts to non-technical stakeholders.