Top 3 Reasons To Join Us
Competitive salary
Premium health care package
7 working hours/day
The Job
Infrastructure Automation & CI/CD
- Design, implement, and maintain CI/CD pipelines for scalable backend and data services.
- Automate infrastructure provisioning using tools like Terraform, Terragrunt, or Ansible.
- Integrate automated testing, deployment workflows, and rollback strategies to support agile development.
- Maintain GitOps practices and CI/CD infrastructure using Jenkins, GitLab, and related tooling.
Kubernetes & EKS Orchestration
- Manage and optimize Kubernetes clusters on AWS (EKS), including node autoscaling, namespace management, and resource allocation.
- Build, configure, and maintain Helm charts for deployment automation and cluster app lifecycle.
- Implement best practices for Kubernetes security, multi-tenant isolation, and cluster upgrades.
- Work with developers to containerize services and deploy them reliably in production.
DataOps (Nice to have)
- Collaborate with data engineers to automate data pipeline deployment using tools like Apache Airflow, ensuring end-to-end scheduling, dependency management, and monitoring of data workflows.
- Implement and manage transformation pipelines, supporting versioned SQL models, testing, and documentation across environments.
- Integrate and manage data warehouse platforms, optimized for analytical and operational workloads.
- Support Metabase as the core business intelligence tool, including integration with data sources, permission management, and dashboard reliability.
- Ensure data quality, lineage, and observability by collaborating on validation frameworks and metrics integration.
- Operationalize and maintain databases such as MySQL, PostgreSQL, and streaming/message brokers like Kafka, ActiveMQ, and Redis.
- Automate and document backup, restore, failover, and disaster recovery strategies for critical infrastructure and data assets.
- Contribute to the deployment and tuning of data storage solutions (e.g., MinIO, S3) and metadata/catalog tools to enhance discoverability and governance.
Cloud Infrastructure Management
- Architect and manage cloud infrastructure primarily on AWS (including VPC, EC2, EKS, RDS, MSK, ElastiCache).
- Design high-availability (HA) and fault-tolerant infrastructure for critical backend and data workloads.
- Support multi-region deployment patterns and network configuration (e.g., DNS, VPN, routing, load balancing).
- Drive cost optimization efforts in compute, storage, and networking.
Monitoring, Logging & Incident Management
- Set up centralized logging using EFK (Elasticsearch, Fluent Bit, Kibana).
- Set up monitoring tools (Prometheus, Grafana, ELK, or Datadog) for proactive alerting.
- Define and enforce SLOs/SLAs for chatbot uptime and response time.
- Lead incident response and root cause analysis for system failures.
Security & Compliance
- Ensure best practices in infrastructure security (IAM, VPC, secrets management).
- Support compliance efforts for data protection (GDPR, SOC 2) in chatbot data pipelines.
- Perform ad-hoc DevOps tasks as required, including emergency patches, incident support, or rapid deployment of security updates.
Your Skills and Experience
- At least 5 years of experience in a Cloud/Network/System Engineer position.
- Bachelor's degree in Computer Science, Information Technology, or another technical field, preferably from a top university specializing in Information Technology.
- Knowledge of security concepts related to DNS, routing, authentication, VPN, proxy services, and DDoS mitigation technologies.
- Experience in designing and implementing networks is required. Experience with HA patterns is a big advantage.
- Have knowledge of and experience with AWS (VPC, EC2, EKS, RDS, MSK, OpenSearch, ElastiCache, SES...).
- Have experience with EKS and Kubernetes, and the ability to write Helm charts.
- Have experience with MySQL and PostgreSQL databases.
- Have experience with OS hardening and troubleshooting.
- Have experience with Linux distributions such as CentOS and Ubuntu.
- Have experience with ActiveMQ, Redis, and Memcached.
- Have experience with monitoring, logging, and alerting tools.
- Have experience with CI/CD tools such as Jenkins and GitLab.
- Familiar with configuring and operating Nginx/Nginx Ingress/Apache.
Plus:
- Having knowledge of Google Cloud
- Having DataOps experience
- Having experience with Terraform/Terragrunt and Ansible
- Having knowledge of and experience with Elasticsearch and Kafka
- Having knowledge of and experience with Postfix, FTP servers, and other services
- Having knowledge of security, including checking for vulnerabilities and patching/updating operating systems and applications
Why You'll Love Working Here
- Working hours: 9:00 - 17:00 (5 days per week); break time: 12:00 - 13:00.
- Modern working equipment (MacBook, ...).
- Performance review: twice a year, based on the employee's performance and contribution.
- Full insurance package as stipulated by the Labor Code.
- Premium PVI Health Insurance Package for all members.
- Transportation allowance and free parking included.
- Technical seminars and workshops annually.
- Free snacks, coffee, and tea available.
- A variety of corporate events: weekly tea breaks, monthly birthday parties, quarterly team building, New Year party, company trip, etc.
- Friendly, open and fast-paced environment where every idea is welcomed.
- Other benefits as stated in Vietnamese Labor Law.