Senior Manager, Site Reliability Engineering (SRE)

Collective Health · San Francisco, CA

Information Technology
Health & Well-Being
Partners & Advocates
$189,000 - $237,000 Per Year
Posted 2 weeks ago

S3
EC2
Kafka
Kubernetes
Security
Docker
Jenkins

We all depend on healthcare throughout our lifetimes, for ourselves and for our families and friends, but it is notoriously difficult to navigate and understand. Healthcare comprises 20% of the US economy, and we think it should work better for all of us. At Collective Health, we believe it’s time for a new day in healthcare, one in which we as members are informed and empowered to make the right care choices when decisions are urgent and critical.

As a Senior Manager of Site Reliability Engineering (SRE), you will lead a team of SREs who are responsible for the day-to-day operation, monitoring, stability, availability, performance, and support of our cloud-based infrastructure.

Infrastructure reliability is critical to Collective Health and its customers, as it’s the foundation on which our healthcare software and services are built. You’ll work with engineering and other stakeholders to establish, monitor, and evolve key SLIs, SLOs, and SLAs for our cloud infrastructure. This work also includes defining and monitoring meaningful and reasonable error budgets, along with cost optimization, capacity planning, and incident management. You will establish and maintain Key Performance Indicators and dashboards for the overall health of our cloud-based infrastructure.

With a focus on continuous improvement of our deployed cloud footprint, you and your team will drive root cause analyses and capture learnings from infrastructure issues through post-mortem processes and reviews. You will identify and implement cloud infrastructure enhancements as required to maintain system performance as the business grows.

The Senior Manager of SRE is a hands-on technical role and requires a thorough understanding of all components of modern cloud-based infrastructure. The Senior Manager will also focus on coaching an existing team of Site Reliability Engineers on best practices and processes toward creating a world-class SRE function.

What you'll do:

  • Lead a team of SREs who are responsible for the day-to-day operation, monitoring, stability, availability, performance, and support of our cloud-based infrastructure.

To be successful in this role, you'll need:

  • Bachelor's degree in Computer Science, Management Information Systems, or equivalent practical experience.
  • 5+ years of experience managing reliability-focused software engineering teams for a cloud-based infrastructure.
  • 7+ years of hands-on SRE experience.
  • Experience in establishing a metrics-driven SRE Network Operations Center, optimizing for site performance, availability, and incident response time.
  • Experience leading high-impact cloud-based infrastructure initiatives.
  • Familiarity with a wide range of cloud-based infrastructure technologies, such as those used in container orchestration, data orchestration, business middleware, security, and governance. This includes AWS (S3, EC2, RDS, and more), Kubernetes, Docker, Kafka, Jenkins, and Grafana.
  • Demonstrated organizational, communication, leadership, and internal customer service skills.
  • Proven ability to build a strong and diverse Site Reliability Engineering culture and work environment that is both supportive and outcome-focused.

Pay Transparency Statement 

This is a hybrid position based out of our San Francisco office, with the expectation of being in office at least two weekdays per week.

The actual pay rate offered within the range will depend on factors including geographic location, qualifications, experience, and internal equity. In addition to the salary, you will be eligible for stock options and benefits like health insurance, 401k, and paid time off. Learn more about our benefits at https://jobs.collectivehealth.com/#benefits.

San Francisco, CA Pay Range
$189,000 - $237,000 USD

About Collective Health

Collective Health is the leading health benefits platform that brings together medical, dental, vision, pharmacy, and program partners into an integrated solution that better enables employees and their families to understand, navigate, and pay for healthcare. By reducing the administrative lift of delivering health benefits, providing an intuitive member experience, and helping control costs and improve outcomes, the company guides employees toward healthier lives and companies toward healthier bottom lines.

We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status. Collective Health is committed to providing support to candidates who require reasonable accommodation during the interview process. If you need assistance, please contact [email protected].

Privacy Notice

For more information about why we need your data and how we use it, please see our privacy policy: https://collectivehealth.com/privacy-policy/.
