Senior Site Reliability Engineer

CÔNG TY TNHH OPTIMIZELY VIỆT NAM

Work location: Hà Nội

Expired: 1 year ago

Job details
At Optimizely, our Site Reliability Engineers (SREs) play a crucial role in ensuring our Digital Experience Optimization platform is reliable, high-performing, and trustworthy. We have a dedicated engineering team that handles massive data pipelines, processing 10 billion events daily, and develops applications that support large-scale experimentation and collaboration workflows. Our platforms are built on AWS and GCP, using technologies such as Kafka, Samza, HBase, MySQL, and Postgres. We build and manage our systems with tools like Travis CI, Jenkins, Docker, Kubernetes, Terraform, and Chef, combining managed and self-hosted approaches. This role offers a unique opportunity to lead our engineering organization in areas such as standardized automated infrastructure, service provisioning and orchestration, service-oriented architectural excellence, and forward-looking planning and execution of large technical projects.
Your role & responsibilities:
Define a roadmap for all engineering teams to adopt fully automated, self-service, highly scalable, cost-efficient, observable, auditable, and reliable infrastructure services as standard practice.
Drive the implementation of this roadmap across the engineering organization, collaborating with SREs and senior engineers while also actively contributing to solving critical challenges.
Provide expert technical guidance and ongoing engineering design review to teams involved in large migrations, service-oriented architecture, architectural shifts, and capacity growth.
Cultivate a metrics-driven operational culture, establishing standards for SLO definition and review, as well as logging, monitoring, alerting, and on-call practices.
Continuously improve blameless incident management processes, root cause analyses, outage prevention, and service recovery strategies across the engineering organization.
Collaborate closely with Security, Quality, and Product teams to prioritize security, privacy, compliance, reliability, and business continuity objectives in our overall roadmap.
Propose and lead major improvements to our production systems that have a significant impact on our business and engineering teams.
Mentor and coach engineers, fostering curiosity and effective problem-solving skills.
Your skills & qualifications:
Demonstrated hands-on technical leadership and business impact with at least 6 years of experience, combining software engineering skills with systems engineering skills to address complex automation and reliability challenges.
Deep technical expertise in cloud providers, containerization technologies, automated deployment frameworks, orchestration frameworks, monitoring, logging, alerting, system internals, networking, databases, distributed systems, and service-oriented architecture.
Proficiency in implementing load, stress, performance, and reliability testing standards at scale to enhance service, platform, and infrastructure resiliency.
Strong advocacy for openness, diversity of opinions, and inclusive discussions, fostering a wide range of ideas and perspectives to solve challenging problems.
Ability to make clear decisions and trade-offs in complex situations involving multiple opinions, needs, teams, technologies, cloud providers, and architectural settings.
Effective communication skills, engaging with stakeholders at all levels of the engineering organization, from executives to junior engineers.
Exemplary accountability, integrity, and resilience, maintaining focus on both long-term goals and key milestones.
Ability to enable the engineering organization to innovate and deliver with greater speed and safety.
Education
Bachelor's degree in Computer Science or a related field, or equivalent industry experience.
Competencies
Displaying Technical Expertise
Critical Thinking
Testing and Troubleshooting
Demonstrating Initiative
Utilizing Feedback

General information

Application deadline: 17/10/2023
Salary: Negotiable

About the company

Overview of Optimizely (Episerver)

At Optimizely (formerly Episerver), we're on a mission to help people unlock their digital potential. With our leading digital experience platform (DXP), we equip teams with the tools and insights they need to create and optimize in new and novel ways. Now,...

Company size

101 to 500 employees