Senior Site Reliability Engineer

Fidelity

Durham, North Carolina

Job Description:

Note: Fidelity will not provide immigration sponsorship for this position.

The Role

Our Site Reliability Engineering group within Enterprise Infrastructure combines Operations Excellence with the Development Experience to deliver services at high scale and high availability, with resilience, by using automation and Infrastructure as Code. We build reliability into our ecosystem by applying best practices in Resiliency Engineering, Automation, Observability, Performance testing and Chaos testing.

The team comes from diverse technical backgrounds, and the responsibilities provide the opportunity for a variety of challenges. Ideal candidates will have a background in either software engineering or systems engineering with a desire to learn the other, or previous experience as an SRE. We are looking for a systems-thinking SRE who has helped teams scale through production insights, operational automation, developer guidance, and real-time metrics.

The Expertise and Skills You Bring

  • Bachelor’s degree or higher in a technology-related field (e.g. Engineering, Computer Science, etc.) required; a master’s degree is a plus.

  • Minimum 5 years of hands-on experience deploying and/or supporting highly distributed multi-tiered systems at scale.

  • 3+ years of experience in cloud development (AWS) and migration; experience building and operating highly resilient platforms in AWS cloud environments.

  • 3-5 years of experience in software development with Python, Node.js, or Java, with a focus on SDLC and automation.

  • Ability to ensure platforms meet high availability, scalability, fault tolerance, and disaster recovery requirements.

  • Hands-on experience with one or more observability tools (Datadog, Splunk, Kibana, Prometheus, Grafana, ELK/OpenSearch, OpenTelemetry).

  • Hands-on experience designing, developing, and executing performance tests using k6, JMeter, and other performance testing tools.

  • Ability to define a Performance Test Strategy Document that sets the approach, metrics, benchmarks, baselines, user response requirements, environments, technical environment and data conditions, and toolsets to use in executing the performance testing.

  • Experience with performance testing types: load, stress, scalability, spike, volume, chaos, and endurance/soak testing.

  • Hands-on experience with container orchestration, preferably with Kubernetes.

  • Experience identifying memory leaks, connection issues, and throughput bottlenecks across web applications, infrastructure, and cloud environments.

  • Strong knowledge of CI/CD pipelines and DevOps practices.

  • Familiarity with chaos engineering and resilience testing tools (e.g., Chaos Monkey, Gremlin).

  • Experience working in high-availability, large-scale production environments.

  • Strong programming/scripting skills in one or more of the following: Python, Java, Go, or Bash.

  • Expertise in automation frameworks and tools for performance validation.

  • Experience managing systems using infrastructure-as-code tools (IAM, ARM, Terraform, Chef).

  • Solid understanding of cloud computing and DevOps concepts, including CI/CD pipelines.

  • Experience in instrumentation, with systems skills in building and operating monitoring, logging, and alerting services for distributed systems at scale.

  • Proven experience in maintaining scalability and resiliency of complex environments.

  • Proven experience in implementing advanced observability practices and techniques at scale.

  • Ability to triage, execute root cause analysis, and be decisive under pressure.

  • Experience managing and interpreting large datasets using query languages and visualization tools.

  • Strong communication skills, with the ability to reach both technical and non-technical audiences.

  • Ability to work with a variety of individuals and groups, both in person and virtually, in a constructive and collaborative manner and build and maintain effective relationships.

  • Experience designing, implementing, and maintaining performance test frameworks that validate, to a high degree of confidence, the production readiness of software applications and infrastructure for stability and performance. Solid understanding of AWS services and experience setting up test environments on AWS (S3, EC2, RDS, etc.).

Category:

Information Technology

Please be advised that Fidelity’s business is governed by the provisions of the Securities Exchange Act of 1934, the Investment Advisers Act of 1940, the Investment Company Act of 1940, ERISA, numerous state laws governing securities, investment and retirement-related financial activities, and the rules and regulations of numerous self-regulatory organizations, including FINRA, among others. Those laws and regulations may restrict Fidelity from hiring and/or associating with individuals with certain criminal histories.

About the Company

Fidelity

We help more than 40 million people feel more confident in their most important financial goals, manage employee benefit programs for nearly 23,000 businesses, and support more than 3,600 advisory firms with innovative investment and technology solutions to grow their businesses. Our diverse businesses and independence give us insight into the entire market and the stability needed to think and act for the long term as we deliver value to you.
COMPANY SIZE
10,000 employees or more
INDUSTRY
Banking
FOUNDED
1946
WEBSITE
https://jobs.fidelity.com/