Job Description
Propose AI solutions aligned with customer requirements and goals.
Build and lead a team, enhancing overall technical capabilities and performance.
Optimize Large Language Models (LLMs) and other AI models, including:
- Training LLMs efficiently (DeepSpeed, FSDP, LoRA)
- Deploying models with Kubernetes, Ray, and Triton Inference Server
- Optimizing model inference speed with ONNX, TensorRT, GGUF, and vLLM
- Implementing Retrieval-Augmented Generation (RAG) pipelines
- Applying model distillation and quantization techniques
Work with HPC infrastructure and distributed AI computing.
Monitor systems with tools such as htop, tcpdump, iostat, and netstat.
Troubleshoot AI system performance bottlenecks.
Requirements
Bachelor's or Master's degree in AI, Computer Science, Machine Learning or a related field.
3+ years of experience in LLM development and optimization.
Hands-on experience with distributed AI training and HPC for AI workloads.
Expertise in GPU acceleration (CUDA, TensorRT, vLLM).
Deep understanding of LLM architectures (GPT, Llama, Falcon, T5, Mistral).
Experience in cloud AI deployment (Kubernetes, OpenStack, Ray, Triton).
Strong ability to troubleshoot system errors and optimize AI workloads.
Professional English proficiency in speaking, reading, and writing.
Benefits
13th-month salary calculated based on actual working time at INNOTECH.
PVI Healthcare Insurance for all employees.
PVI Healthcare Insurance for employees' family members.
Mooncakes and Tet gifts.
Quarterly/project kickoff team-building budget.
High-resolution laptop and monitor provided for work.
Performance bonus plan.
Employee referral bonus: 2,000,000 - 10,000,000 VND (depending on level/role).
Working hours: Monday to Friday.
Annual company trips / Football club / Climbing club / Year-end party.
Learning and certification support.
Value-oriented, international working environment with a flexible culture.
Other Information
Work Location
- 33 Ba Vì, Ward 4, Tân Bình District, Ho Chi Minh City
General Information