Position Type: Regular

Your opportunity


At Schwab, you’re empowered to make an impact on your career. Here, innovative thought meets creative problem solving, helping us “challenge the status quo” and transform the finance industry together.

 

The Sr Manager, Reliability Engineering and Operations is enthusiastic about leading technology teams responsible for delivering exceptional application and production support. You need a proven track record of critical thinking, with a laser focus on pragmatic problem solving, production support, and customer satisfaction. We require strong ethics and the ability to partner with and influence business partners, product teams, and technologists across the organization. The right candidate will have a strong background in leading and developing 24x7 support teams.

 

Essential Functions:

  • Leadership & Management:
    • Leading and mentoring a Production Operations team for Schwab’s Workplace Financial Services Technology team, fostering a culture of continuous improvement and innovation
    • Collaborating with cross-functional teams to ensure alignment on reliability and performance goals
    • Acting as a hands-on technical leader who leads from the front and inspires thought leadership within the team
    • Identifying tactical and strategic opportunities to improve service health, performance, reliability, and telemetry
    • Driving a shift-left mindset and influencing architectural decisions to ensure resiliency and scale at the outset of the software development process
    • Advocating automation so that teams follow patterns that ensure repeatability, consistency, and portability
    • Identifying toil and technical debt, developing a comprehensive plan, and leading the team through its execution
  • Reliability & Performance:
    • Conducting post-mortem reviews to identify areas for improvement and implement solutions to enhance system reliability
    • Implementing and promoting performance engineering practices to ensure optimal system performance
    • Developing and executing strategies for destructive testing to identify potential points of failure and improve system resilience
    • Working closely with the development team to define a sustainable operating model for mobile applications, focusing on platform scale, availability, fault tolerance, and performance
    • Leading the team with a data-driven mindset, focusing on key performance metrics such as MTTD, MTTR, and availability, in close collaboration with development teams
  • Production Engineering & Operational Support:
    • Overseeing production engineering efforts to ensure systems are designed for operational excellence and reliability
    • Providing technical guidance as needed during incidents and daily work
    • Providing leadership around incident management and root cause analysis to resolve production issues and prevent recurrence
    • Establishing and maintaining operational support practices, including monitoring, alerting, and incident response
    • Leading the team in their SRE maturity journey
  • Continuous Improvement:
    • Driving continuous improvement initiatives in reliability, performance, automation, and operational support
    • Staying current with industry trends and best practices to ensure our systems and processes remain in line with SRE tenets

What you have


Required Qualifications:

      • 10+ years of experience running and managing 24/7/365 application support teams responsible for enterprise applications, infrastructure, and systems
      • 10+ years of experience in measuring, tracking, improving, and reporting on SLOs, SLAs, and KPIs
      • 7+ years of experience supporting enterprise applications in production
      • 5+ years of experience working with enterprise ITSM business processes
      • ITIL experience with enterprise systems, including but not limited to:
        • Event and Incident Management
        • Release and Deployment
        • Enterprise Change Management
      • Available for after-hours calls/incident management
      • Experience managing multi-shift teams
      • Recent experience leading an operations organization focused on event and incident management
      • Experience in monitoring tools with a focus on ITIL capabilities
      • Experience with GitHub, Bamboo, Bitbucket, Splunk, ThousandEyes, and AppDynamics 

What’s in it for you

At Schwab, we’re committed to empowering our employees’ personal and professional success. Our purpose-driven, supportive culture and focus on your development mean you’ll get the tools you need to make a positive difference in the finance industry. Our Hybrid Work and Flexibility approach balances our ongoing commitment to workplace flexibility, serving our clients, and our strong belief in the value of being together in person on a regular basis.

We offer a competitive benefits package that takes care of the whole you – both today and in the future:

  • 401(k) with company match and employee stock purchase plan
  • Paid time for vacation, volunteering, and a 28-day sabbatical after every 5 years of service for eligible positions
  • Paid parental leave and family building benefits
  • Tuition reimbursement
  • Health, dental, and vision insurance