The application deadline for this position has passed.
Job Description
Job Summary
1. About the Role:
We are seeking a highly skilled Site Reliability Engineer with experience applying GenAI to automate and enhance the reliability of complex data platforms in the Data Division. You will be responsible for building self-healing infrastructure, developing AI-powered observability, and automating incident response across data pipelines (e.g., Databricks, Glue, Kafka, Flink). This is a high-impact role in which you will shape the future of data reliability at Techcombank, mentor engineers, and lead initiatives that span multiple teams and domains.
2. Key Responsibilities:
Platform Reliability & Automation
• Design, implement, and operate reliable, scalable, and observable data platforms.
• Automate incident triage, remediation, and postmortems using GenAI-powered tools.
• Develop intelligent runbooks and self-healing workflows using LLMs.
GenAI-Enabled SRE Practices
• Build and integrate GenAI copilots for on-call support, anomaly detection, and RCA (root cause analysis).
• Fine-tune or prompt-engineer LLMs for specific use cases such as summarizing logs, interpreting metrics, or generating remediation steps.
• Leverage vector databases (e.g., FAISS, Weaviate) to retrieve telemetry and incident history for GenAI prompts.
Observability & Anomaly Detection
• Integrate GenAI with observability tools (e.g., Datadog, Prometheus, Grafana, OpenTelemetry).
• Build systems for natural language querying of platform health and pipeline performance.
• Collaborate with data engineers to monitor SLIs/SLOs across ingestion, transformation, and delivery layers.
CI/CD & Risk Management
• Integrate GenAI into CI/CD pipelines to generate blast radius analyses and deployment guardrails.
• Use LLMs to assess the risk of configuration or schema changes before production rollout.
• Automate validation and rollback strategies based on historical outcomes.
WHY BECOME IT/DATA EXPERTS AT TECHCOMBANK?
Having invested over 500 million USD in large-scale IT projects, Techcombank is one of the leading banks in technology trends in Vietnam.
You will grow with Techcombank through the opportunity to learn from top experts from across the world.
Techcombank provides a rewarding remuneration structure commensurate with your achievements and contributions.
Techcombank is ranked among the top 2 best places to work in the banking industry, where you can experience various exciting activities throughout the year: Company Anniversary, Team Building, Active Saturday, Year End Party, etc.
Requirements
• Bachelor's degree in computer science, software engineering, or information technology.
• Good command of English.
• 5+ years in SRE, DevOps, or Data Engineering roles with a strong focus on automation and observability.
• Solid experience in cloud-native data platforms (e.g., Databricks, Glue, Kafka, Flink, S3, Lambda).
• Proven experience using or integrating GenAI tools (OpenAI, Claude, HuggingFace Transformers).
• Proficiency in Python or Scala; experience with Spark and Airflow is a plus.
• Familiarity with LLM techniques: prompt engineering, embeddings, retrieval-augmented generation (RAG).
• Hands-on experience with monitoring and alerting tools (e.g., Prometheus, Grafana, Datadog).
• Experience with Infrastructure as Code (e.g., Terraform, CloudFormation).
Preferred:
• Experience fine-tuning LLMs or integrating GenAI agents into production systems.
• Familiarity with vector databases (e.g., Pinecone, Qdrant, FAISS).
• Knowledge of data quality frameworks and lineage tools (e.g., Deequ, Great Expectations, Amundsen, Unity Catalog).
• Understanding of ITIL/incident management frameworks.
• Strong communication and documentation skills, especially in on-call and postmortem environments.
Other Information
Java
Python
Apache Spark
Unity
AWS Lambda
Observability
Scala
DevOps
Grafana
OpenAI
Amazon S3
Apache Kafka
ITIL
Datadog
Terraform
AWS CloudFormation
Prometheus
Apache Airflow
Apache Flink
Databricks
AWS Glue
IaC
FAISS
LLM
Vector
GenAI
RAG
Claude
General Information
- Application deadline: 24/11/2025
- Salary: Negotiable