Top 3 Reasons To Join Us
Probation: 2 months with 100% salary
Annual Leave: 14 days/year
13th month salary, performance/holiday bonus...
The Job
Infrastructure Management:
- Design, develop, and maintain robust and scalable data pipelines to handle large datasets using both on-premise and cloud platforms (e.g., AWS, GCP, Azure).
- Implement and manage data storage solutions, including databases and data lakes, ensuring data integrity and performance.
Data Integration:
- Integrate data from various internal and external sources such as databases, APIs, flat files, and streaming data.
- Ensure data consistency, quality, and reliability through rigorous validation and transformation processes.
ETL Development:
- Develop and implement ETL (Extract, Transform, Load) processes to automate data ingestion, transformation, and loading into data warehouses and lakes.
- Optimize ETL workflows to ensure efficient processing and minimize data latency.
Data Quality & Governance:
- Implement data quality checks and validation processes to ensure data accuracy and completeness.
- Develop data governance frameworks and policies to manage data lifecycle, metadata, and lineage.
Collaboration and Support:
- Work closely with data scientists, AI engineers, and developers to understand their data needs and provide technical support.
- Facilitate effective communication and collaboration between the AI and data teams and other technical teams.
Continuous Improvement:
- Identify areas for improvement in data infrastructure and pipeline processes.
- Stay current with the latest industry trends and technologies related to data engineering and big data.
Your Skills and Experience
- Bachelor's degree in Computer Science, Engineering, Data Science, or a related field. A Master's degree is a plus.
- 5+ years of experience in data engineering or a similar role.
- Proven experience with on-premise and cloud platforms (AWS, GCP, Azure).
- Strong background in data integration, ETL processes, and data pipeline development.
- Experience leading the design and development of high-performance AI and data platforms, including IDEs, permission management, data pipelines, code management, and model deployment systems.
- Proficiency in scripting and programming languages (e.g., Python, SQL, Bash).
- Strong knowledge of data storage solutions (e.g., relational/SQL databases, NoSQL stores, data lakes).
- Experience with big data technologies (e.g., Apache Spark, Hadoop).
- Experience with CI/CD tools (e.g., Jenkins, GitLab CI, CircleCI).
- Understanding of data engineering and MLOps methodologies.
- Awareness of security best practices in data environments.
- Excellent problem-solving skills and attention to detail.
- Hands-on experience managing an on-premise Spark cluster for big data processing, covering both deployment and usage.
Why You'll Love Working Here
- Probation: 2 months with 100% salary during probation.
- Annual Leave: 14 days/year, calculated based on actual months worked.
- Allowances: Lunch, parking, mobile phone, and business trip expenses.
- Equipment: MacBook and mobile phone provided from the probation period.
- Activities & Perks: Annual teambuilding and company trip, monthly happy hours, snacks & cafeteria at the office.
- Working Hours: Flexible working arrangement, 8 hours/day at the office.
- Insurance: Full participation in Social Insurance and Health Insurance for employees and their family members.
- Bonuses & Gifts: 13th month salary, performance bonus, holiday bonuses, and Tet gifts.