- Design, develop, and maintain scalable data pipelines to collect, process, and store large volumes of data from diverse sources
- Develop and maintain ETL workflows to extract, transform, and load data from source systems into the data warehouse, ensuring data quality and consistency
- Design dimensional data models and database schemas optimized for analytical queries and reporting needs
- Implement and optimize data warehouse architectures and indexing strategies for efficient data storage and retrieval
- Collaborate with cross-functional teams to understand data requirements and deliver solutions for data ingestion, transformation, and storage, and with business analysts and stakeholders to translate reporting and analytics requirements into technical solutions
- Monitor and troubleshoot data pipeline performance issues, implementing solutions to improve reliability and efficiency
- Build and optimize big data processing jobs on distributed computing frameworks such as Apache Hadoop, Apache Spark, or Apache Flink