[Remote] Manager Site Reliability Operations

Work from home Full-time role Hiring

Note: This is a remote position open to candidates in the USA. Mercury Insurance is a well-recognized company known for its achievements and culture, and was recently named one of America's Best Midsize Employers for 2026. The Site Reliability Operations Manager will lead a team responsible for observability, real-time monitoring, and incident management across production platforms, ensuring operational excellence and service reliability.

Responsibilities

  • Lead the Site Reliability Operations team, including the Network Operations Center (NOC), responsible for observability, real-time monitoring, incident response, and operational excellence for key enterprise services; set direction, priorities, and success metrics for the team
  • Partner with Product Management, Engineering, SRE, and the rest of the infrastructure team to embed CI/CD and release best practices into operations, including automated build/test/deploy, health checks, rollbacks, release monitoring via the NOC, and change-management guardrails
  • Oversee service reliability monitoring and incident management: ensure appropriate observability (metrics, logs, traces, dashboards), well-tuned alerting thresholds, escalation paths, and effective communications to stakeholders and leadership during incidents
  • Own and mature the Problem Management function for the team: drive root cause analysis (RCA) of recurring or high-severity incidents, standardize post-incident reviews, and ensure corrective actions and follow-ups are implemented and verified
  • Define, track, and report operational and reliability metrics (e.g., availability, MTTR, incident volume, change failure rate, deployment frequency, problem resolution time); provide regular insights and recommendations to Technology Operations leadership
  • Champion automation and “operations as code” (infrastructure as code, configuration as code, automated runbooks), working with engineering teams to reduce manual toil and improve consistency, speed, and safety of operations and releases
  • Recruit, develop, coach, and evaluate team members; provide performance feedback, make salary and promotion recommendations, and foster a high-performing, collaborative culture aligned with Mercury’s core values
  • Provide leadership coverage for 24x7 mission-critical support through the NOC and on-call rotations; ensure sustainable on-call practices, high-quality runbooks, and continuous improvement of tooling and processes

Skills

  • Bachelor's degree in Computer Science, Information Systems, Engineering, or a related field, or an equivalent combination of education and work experience
  • 7+ years of experience in IT operations, SRE, DevOps, or related roles supporting mission-critical systems
  • 3+ years of experience in a lead or management role overseeing technical teams in a 24x7 environment
  • Strong understanding of CI/CD pipelines (build, test, security scanning, deployment, rollback) and how they support reliable operations
  • Solid knowledge of observability practices and tools (metrics, logs, traces, dashboards, alerts) and how to design actionable monitoring and alerting for production systems
  • Deep familiarity with incident and problem management processes, including root cause analysis methods and post-incident review facilitation
  • Working knowledge of DevOps/SRE concepts such as SLOs/SLIs, error budgets, resilience patterns, automation to reduce toil, and blameless culture
  • Demonstrated ability to lead and influence cross-functional teams, build relationships, and collaborate effectively with engineering, InfoSec, infrastructure, and business stakeholders
  • Excellent communication skills, both written and verbal; able to clearly communicate technical issues, risks, and recommendations to technical and non-technical audiences, including senior leadership
  • Strong analytical and problem-solving skills; able to analyze operational data and trends to identify risks, drive decisions, and prioritize improvements
  • Self-motivated, adaptable, and able to operate with minimal supervision in a fast-changing environment
  • Ability to work extended hours, nights, or weekends as needed to support critical releases or resolve major incidents
  • Advanced coursework, certifications, or experience in Site Reliability Engineering, DevOps, cloud platforms, or ITIL
  • Experience leading teams that support services deployed via modern CI/CD pipelines and running on cloud and/or container platforms (e.g., Kubernetes/OpenShift, AWS). Experience integrating operations functions with DevOps/SRE teams, including shared ownership of reliability goals and metrics

Benefits

  • Competitive compensation
  • Flexibility to work from anywhere in the United States for most positions
  • Paid time off (vacation time, sick time, 9 paid Company holidays, volunteer hours)
  • Incentive bonus programs (potential for holiday bonus, referral bonus, and performance-based bonus)
  • Medical, dental, vision, life, and pet insurance
  • 401(k) retirement savings plan with company match
  • Engaging work environment
  • Promotional opportunities
  • Education assistance
  • Professional and personal development opportunities
  • Company recognition program
  • Health and wellbeing resources, including free mental wellbeing therapy/coaching sessions, child and eldercare resources, and more

Company Overview

  • Mercury Insurance offers insurance products ranging from personal auto and homeowners insurance to mechanical breakdown protection. Founded in 1962 and headquartered in Los Angeles, California, USA, it has a workforce of 5,001-10,000 employees. Its website is http://www.mercuryinsurance.com.

Company H1B Sponsorship

  • Mercury Insurance has a track record of offering H1B sponsorships, with 7 in 2026, 22 in 2025, 23 in 2024, 14 in 2023, 15 in 2022, 8 in 2021, and 13 in 2020. Please note that this does not guarantee sponsorship for this specific role.