Senior Site Reliability Engineer
CÔNG TY TNHH OPTIMIZELY VIỆT NAM
Location: Hà Nội
Application deadline: 17/10/2023
This position is no longer accepting applications.
At Optimizely, our Site Reliability Engineers (SREs) play a crucial role in ensuring our Digital Experience Optimization platform is reliable, high-performing, and trustworthy. Our dedicated engineering team handles massive data pipelines that process 10 billion events daily, and develops applications that support large-scale experimentation and collaboration workflows. Our platforms are built on AWS and GCP, using technologies such as Kafka, Samza, HBase, MySQL, and Postgres. We use tools like TravisCI, Jenkins, Docker, Kubernetes, Terraform, and Chef to build and manage our systems, combining managed and self-hosted approaches. This role offers a unique opportunity to lead our engineering organization in areas such as standardized automated infrastructure, service provisioning and orchestration, service-oriented architectural excellence, and forward-looking planning and execution of large technical projects.
Your role & responsibilities:
Define a roadmap for all engineering teams to adopt fully automated, self-service, highly scalable, cost-efficient, observable, auditable, and reliable infrastructure services as standard practice.
Drive the implementation of this roadmap across the engineering organization, collaborating with SREs and senior engineers while also actively contributing to solving critical challenges.
Provide expert technical guidance and ongoing engineering design review to teams involved in large migrations, service-oriented architecture, architectural shifts, and capacity growth.
Cultivate a metrics-driven operational culture, establishing standards for SLO definition and review, as well as logging, monitoring, alerting, and on-call practices.
Continuously improve blameless incident management processes, root cause analyses, outage prevention, and service recovery strategies across the engineering organization.
Collaborate closely with Security, Quality, and Product teams to prioritize security, privacy, compliance, reliability, and business continuity objectives in our overall roadmap.
Propose and lead major improvements to our production systems that have a significant impact on our business and engineering teams.
Mentor and coach engineers, fostering curiosity and effective problem-solving skills.
Your skills & qualifications:
Demonstrated hands-on technical leadership and business impact with at least 6 years of experience, combining software engineering and systems engineering skills to address complex automation and reliability challenges.
Deep technical expertise in cloud providers, containerization technologies, automated deployment frameworks, orchestration frameworks, monitoring, logging, alerting, system internals, networking, databases, distributed systems, and service-oriented architecture.
Proficiency in implementing load, stress, performance, and reliability testing standards at scale to enhance service, platform, and infrastructure resiliency.
Strong advocacy for openness, diversity of opinions, and inclusive discussions, fostering a wide range of ideas and perspectives to solve challenging problems.
Ability to make clear decisions and trade-offs in complex situations involving multiple opinions, needs, teams, technologies, cloud providers, and architectural settings.
Effective communication skills, engaging with stakeholders at different levels, from executives to junior engineers throughout the engineering organization.
Exemplary accountability, integrity, and resilience, maintaining focus on both long-term goals and key milestones.
Ability to enable the engineering organization to innovate and deliver with greater speed and safety.
Education
Bachelor's degree in Computer Science or a related field, or equivalent industry experience.
Competencies
Displaying Technical Expertise
Critical Thinking
Testing and Troubleshooting
Demonstrating Initiative
Utilizing Feedback
General information
- Application deadline: 17/10/2023
- Salary: Negotiable
About the company
Overview of Optimizely (formerly Episerver)
At Optimizely (formerly Episerver), we're on a mission to help people unlock their digital potential. With our leading digital experience platform (DXP), we equip teams with the tools and insights they need to create and optimize in new and novel ways. Now,...
Company size
101 - 500 employees