Senior Site Reliability Engineer

Austin, TX; Southlake, TX
Requisition ID: 2026-123156
Category: Engineering & Software Development
Position type: Regular
Pay range: USD $129,000.00 - $175,000.00 / Year
Application deadline: 2026-06-29

Your opportunity


At Schwab, you’re empowered to make an impact on your career. Here, innovative thought meets creative problem solving, helping us challenge the status quo and transform the finance industry together.

We believe in the importance of in-office collaboration and fully intend for the selected candidate for this role to work on site in the specified location(s). 

As a Senior Site Reliability Engineer within the CET SAvE organization, you will play a critical leadership role advancing the reliability, scalability, and performance of Schwab’s mobile and digital platforms. You will lead efforts to elevate production operations through modern Site Reliability Engineering practices, shaping how engineering teams design, build, and operate resilient systems at scale. 


In this role, you will drive measurable improvements in service health and client experience by defining and executing strategies that enhance observability, automation, and system resilience. You will partner cross-functionally with engineering, architecture, infrastructure, and product teams to embed reliability, scalability, and operational excellence into the full software development lifecycle. 


Success in this role requires strong problem-solving and decision-making skills, particularly in complex, high-scale distributed environments. You will influence technical direction, introduce best practices such as service level objectives and error budgets, and guide teams in reducing operational toil through automation and tooling innovation. Your leadership will ensure teams are aligned on reliability goals, respond effectively to production challenges, and continuously improve systems through learning and adaptation.

You will also play a key role in evolving operational maturity by strengthening on-call practices, enabling faster detection and resolution of issues, and fostering a culture of accountability, collaboration, and continuous improvement. This is an opportunity to shape enterprise-wide engineering standards while developing high-performing teams and advancing modern reliability engineering capabilities. 

Key Responsibilities: 

Production Operations & Incident Management

  • Respond to system alerts and production incident escalations
  • Lead or support incident triage, resolution, and root cause analysis
  • Drive and contribute to post-incident reviews and continuous improvement actions
  • Participate in an on-call rotation to support high-availability systems 

Observability & Monitoring

  • Ensure comprehensive monitoring coverage and effective alerting strategies across systems
  • Continuously improve visibility into system performance, reliability, and health
  • Define and evolve observability best practices, including telemetry, dashboards, and alert thresholds 

Automation & Engineering Excellence

  • Design and build automation solutions to reduce operational toil and improve resiliency
  • Develop scripts and tooling using Python and shell scripting for system maintenance and performance optimization
  • Contribute to CI/CD and deployment pipeline improvements
  • Automate processes such as service recovery, system maintenance, and certificate management 


Collaboration & Partnership 

  • Partner with development teams to understand system changes and ensure production readiness
  • Establish guardrails for monitoring, alerting, and escalation procedures
  • Embed reliability practices into the software development lifecycle 

Reliability Engineering & Continuous Improvement

  • Proactively identify system weaknesses, risks, and performance gaps
  • Drive improvements in system reliability, scalability, and resilience
  • Implement and evolve SRE best practices (SLOs, error budgets, incident reduction strategies)  


Innovation & Emerging Capabilities 

  • Explore the use of AI and automation to improve incident detection, triage, and response
  • Identify opportunities to enhance response times and reduce manual intervention 


Technical Influence & Mentorship

  • Mentor and support junior engineers in SRE best practices and automation techniques
  • Influence engineering teams to adopt proactive reliability and observability practices
  • Promote a culture of curiosity, ownership, and continuous improvement 


What you have


To ensure that we fulfill our promise of “challenging the status quo,” this role has specific qualifications that successful candidates should have: 

Required Qualifications: 

  • Bachelor of Science degree in Computer Science or a related field
  • 10+ years of experience in software development and site reliability engineering, including work with cloud-native architectures and distributed systems
  • 8+ years of experience in DevOps and/or site reliability engineering, with a focus on production operations, automation, and system reliability at scale
  • 8+ years of experience with CI/CD pipelines, observability, and monitoring/telemetry platforms
  • 5+ years of experience leading the implementation and scaling of reliability engineering practices such as service level objectives, monitoring strategies, incident reviews, and automation-driven improvements 
  • Demonstrated ability to design, develop, and maintain production-grade systems, automation frameworks, and reliability tooling across the software development lifecycle 
  • Experience supporting high-availability, distributed systems at scale
  • Deep experience with monitoring, observability, and incident management practices
  • Strong experience with automation, scripting, and operational tooling 


Preferred Qualifications:

  • Strong programming and automation experience using languages such as Python or Java, including building scalable services and APIs
  • Experience with application performance monitoring tools (Splunk preferred)
  • Experience with Kubernetes and container orchestration platforms
  • Experience with infrastructure-as-code tools such as Terraform or similar technologies
  • Familiarity with public cloud platforms (AWS, GCP, or Azure)
  • Understanding of cloud infrastructure components including compute, storage, networking, load balancing, DNS, and security architectures 
  • Proven ability to lead in fast-paced environments, influence cross-functional teams, and drive alignment on technical strategy
  • Strong communication skills with the ability to translate complex technical concepts to a variety of audiences 


What Sets You Apart:

  • Proven ability to operate effectively in complex, large-scale systems with evolving documentation
  • Strong analytical and problem-solving skills with a proactive mindset
  • Curiosity and initiative to deeply understand systems and identify improvement opportunities 
  • Ability to automate troubleshooting, enhance observability, and reduce manual intervention
  • Strong communication skills, especially during incidents and cross-team coordination 



Applicants must be currently authorized to work in the United States on a full-time basis without employer sponsorship. 


In addition to the salary range, this role is eligible for bonus or incentive opportunities. 


What’s in it for you

At Schwab, you’re empowered to shape your future. We champion your growth through meaningful work, continuous learning, and a culture of trust and collaboration—so you can build the skills to make a lasting impact. Our Hybrid Work and Flexibility approach balances our ongoing commitment to workplace flexibility, serving our clients, and our strong belief in the value of being together in person on a regular basis.

We offer a competitive benefits package that takes care of the whole you – both today and in the future:

  • 401(k) with company match and employee stock purchase plan
  • Paid time for vacation and volunteering, and a 28-day sabbatical after every 5 years of service for eligible positions
  • Paid parental leave and family building benefits
  • Tuition reimbursement
  • Health, dental, and vision insurance
