- Design, implement, and maintain continuous integration and continuous deployment (CI/CD) pipelines to automate software delivery processes.
- Manage and optimize cloud infrastructure on platforms such as AWS, Azure, or Google Cloud to ensure scalability, reliability, and cost-efficiency.
- Implement and maintain observability tools (e.g., Datadog, ELK, Grafana, Prometheus, Sentry).
- Define and monitor Service Level Indicators (SLIs) and Service Level Objectives (SLOs), and ensure adherence to Service Level Agreements (SLAs) to maintain high system reliability.
- Automate deployment, testing, and monitoring processes to enhance efficiency and minimize manual intervention.
- Collaborate with development teams to design and implement scalable, maintainable, and secure solutions.
- Monitor system performance, troubleshoot issues, and implement solutions to ensure high availability and minimal downtime.
- Participate in code reviews, providing feedback to developers on best practices and DevOps principles.
- Stay up to date with the latest DevOps tools, technologies, and methodologies to drive continuous improvement in processes.
- Ensure security and compliance standards are integrated into all DevOps practices.
* Must have:
- 2-4 years of experience in a DevOps or related role, such as systems administration or software engineering with a focus on automation.
- Proven experience in designing and managing CI/CD pipelines and cloud-based infrastructure.
- Proficiency in scripting languages such as Bash, Python, or Go.
- Strong knowledge of containerization technologies, including Docker and Kubernetes.
- Familiarity with version control systems, particularly Git.
- Experience with cloud platforms like AWS, Azure, or Google Cloud.
- Experience with infrastructure as code tools like Terraform.
- Proficiency in monitoring and logging tools such as Datadog, ELK Stack, Grafana, Prometheus, Sentry, or New Relic.
- Strong understanding of SLA, SLI, and SLO concepts and their application in ensuring system reliability and performance.
- Experience in on-call rotations and incident response.
- Strong problem-solving and analytical skills to address complex technical challenges.
* Nice to have:
- Certifications in cloud platforms (AWS, Azure, GCP).
- Experience working in agile development environments and familiarity with agile methodologies, especially Scrum.
- Experience documenting infrastructure, deployment processes, and incident resolutions to support team knowledge and operational consistency.
- Good command of English (Listening, Reading, Writing).