Principal Site Reliability Engineer (SRE)

Ally Financial Inc

Charlotte, NC (remote)

JOB DETAILS
SKILLS
Algorithms, Amazon Web Services (AWS), Artificial Intelligence (AI), Automation, Budgeting, Cloud Computing, Communication Skills, Compensation and Benefits, Computer Programming, Content Delivery Network (CDN), Continuous Deployment/Delivery, Continuous Integration, Customer Support/Service, DNS (Domain Name System), DevOps, Disaster Recovery, Distributed Computing, Diversity, Diving, Financial Services, GitHub, Go Programming Language (Golang), High Availability, Identify Issues, Incident Management, Incident Response, Jenkins, Large-Scale Systems, Load Balancing, Machine Tool, Mentoring, Microservices, Operational Improvement, Operational Strategy, Problem Solving Skills, Process Improvement, Production Systems, Python Programming/Scripting Language, Record Keeping, Reliability Engineering, Root Cause Analysis, Safety/Work Safety, Startup, Student Loans, System Integration (SI), Systems Engineering, Systems Reliability, Systems Scalability, Team Player, Writing Skills
LOCATION
Charlotte, NC
POSTED
30+ days ago

General Information ----------------

Ally Financial only succeeds when its people do - and that's more than some cliché people put on job postings. We live this stuff. We see our people as, well, people - with interests, families, friends, dreams, and causes that are all important to them. Our focus is on the health and safety of our teammates as well as work-life balance and diversity and inclusion. From generous benefits to a variety of employee resource groups, we strive to build paths that encourage employees to stretch themselves professionally. We want to help you grow, develop, and learn new things. You're constantly evolving, so shouldn't your opportunities be too?

Work Schedule -------------

Ally designates roles as (1) fully on-site, (2) hybrid, or (3) fully remote. Hybrid roles are generally expected to be in the office a certain number of days per week as indicated by your manager. Your hiring manager will discuss this role's specific work requirements with you during the hiring process. All work requirements are subject to change at any time based on leader discretion and/or business need.

The Opportunity ----------------

At Ally, you get a startup feel but experience the benefits of a company that's worked out the kinks and is fulfilling its purpose. We're always evolving and see that as a good thing. From owning our work to seeing its impact in the real world, our team is relentless in finding new ways technology can help make experiences better and help people. We are problem solvers, we value diverse thinking, we support one another, and we challenge ourselves to think bigger in the journey to deliver customer-obsessed tech solutions.

At this time, Ally will not sponsor a new applicant for employment authorization for this position.

The Work ---------

  • Design and implement highly available, scalable infrastructure systems that support mission-critical production services, including automated deployment pipelines, observability platforms, and disaster recovery.
  • Lead incident response and postmortem processes, diving deep into complex distributed systems failures to identify root causes and drive systemic reliability improvements across engineering teams.
  • Develop and maintain service level objectives (SLOs) and error budgets using data-driven approaches to balance feature velocity with system reliability and guide organizational decision-making.
  • Build tooling and automation to eliminate toil, improve operational efficiency, and enable engineering teams to safely deploy and operate services with minimal manual intervention.

The Skills You Bring -------------------

### Minimum Qualifications

  • 7 years of relevant experience
  • Bachelor's degree in a relevant field of study or equivalent

### Preferred Qualifications

  • 5 years of experience in site reliability engineering, systems engineering, or DevOps roles with a proven track record of maintaining large-scale production systems
  • Deep expertise in cloud (AWS), including infrastructure as code tools like Terraform, CloudFormation, or Pulumi
  • Experience defining and measuring SLIs (Service Level Indicators), SLOs (Service Level Objectives), and error budgets, and using them to drive reliability improvements and inform product decisions

### Additional Skills

  • Proficiency in AI development
  • Strong programming skills in languages and runtimes such as Python, Go, or Node.js, with the ability to write production-quality code for automation tooling and system integration
  • Extensive experience with container orchestration (ECS) or similar, and microservices architectures in production environments
  • Proficiency with observability and monitoring tools (Dynatrace, Prometheus, Grafana, Datadog, New Relic, or similar), and experience building comprehensive monitoring and alerting systems
  • Solid understanding of networking concepts, load balancing, CDNs, DNS, and distributed systems principles, including consensus algorithms and failure modes
  • Hands-on experience with CI/CD (Continuous Integration and Continuous Deployment) pipelines and GitOps workflows using tools like Jenkins, GitHub Actions, ArgoCD, or CircleCI
  • Strong incident management and troubleshooting skills, with the ability to quickly diagnose and resolve complex production issues under pressure
  • Excellent communication and collaboration skills, with the ability to influence technical direction across multiple teams and mentor engineers at various levels

Compensation and Benefits -------------------------

Ally's compensation program offers market-competitive base pay and pay-for-performance incentives (bonuses based on achieving personal and company goals). Our Total Rewards program includes industry-leading compensation and benefits, plus additional incentives that are designed to meet your needs and those of your family. This includes:

  • Time Away Program: starts at 20 paid time off days in addition to 11 paid holidays and 8 hours of volunteer time off yearly
  • Planning for the Future: industry-leading 401(k) retirement savings plan with matching and company contributions, student loan pay downs, and 529 educational savings assistance programs
  • Supporting your Health & Well-being: flexible health and insurance options, including medical, dental, and vision; employee, spouse, and child life insurance; short- and long-term disability; pre-tax Health Savings Account with employer contributions; and a total well-being program
  • Building a Family: adoption, surrogacy, and fertility assistance, paid parental and caregiver leave, Dependent Day Care FSA, back-up child and adult/elder care days, and childcare discounts
  • Work-Life Integration: other benefits, including a Mentally Fit Employee Assistance Program, subsidized and discounted Weight Watchers program, and other employee discount programs

Other compensation, depending on the role, may include travel allowances, relocation assistance, a signing bonus, and/or equity.

Who We Are ------------

Ally Financial is a customer-centric, leading digital financial services company with passionate customer service and innovative financial solutions. We are relentlessly focused on Doing it Right and being a trusted financial-services provider to our consumer, commercial, and corporate customers. For more information, visit www.ally.com.

Ally is an equal opportunity employer committed to diversity and inclusion in the workplace. All qualified applicants will receive consideration for employment without regard to age, race, color, sex, religion, national origin, disability, sexual orientation, gender identity or expression, pregnancy status, marital status, military or veteran status, genetic disposition, or any other reason protected by law. We are committed to working with and providing reasonable accommodation to applicants with physical or mental disabilities.
