Site Reliability Engineer

Transform9

Birmingham, AL

JOB DETAILS
SKILLS
Amazon Web Services (AWS), Ansible, Bash Scripting, Best Practices, Cloud Computing, Computer Science, Continuous Deployment/Delivery, Continuous Integration, DevOps, Docker, Go Programming Language (Golang), Health Plan, Healthcare, Healthcare Providers, Identify Issues, Incident Response, Microsoft Azure, On Call, Performance Analysis, Problem Solving Skills, Python Programming/Scripting Language, Reliability Engineering, Root Cause Analysis, Scalable System Development, Scripting (Scripting Languages), Software Engineering, System Architecture, Systems Reliability, Systems Scalability, Telephone Skills, Training/Teaching, User Interface/Experience (UI/UX)
POSTED
30+ days ago

At Transform9, we are dedicated to transforming healthcare access and patient communication through our conversational agent platform. Our mission is to provide seamless experiences for patients and healthcare providers alike. To support our growing platform, we are seeking a Site Reliability Engineer to ensure the health, performance, and reliability of our systems. In this role, you will collaborate with the development and operations teams to build and maintain scalable infrastructure, automate processes, and improve the availability of our services. Your expertise will be vital in creating a robust environment that can support our growth in the healthcare sector.


Responsibilities

  • Design, implement, and maintain scalable and reliable systems to support the Transform9 platform and services.
  • Monitor system performance, respond to incidents, and troubleshoot issues to ensure optimal uptime and reliability.
  • Build and manage CI/CD pipelines to facilitate smooth deployments and automate workflows.
  • Collaborate with development teams to establish best practices in system architecture, deployment, and monitoring.
  • Implement observability solutions to gain insights into system performance and user experience.
  • Participate in on-call rotations to respond to system alerts, perform root cause analysis, and implement remediation strategies.

Requirements

  • Bachelor’s degree in Computer Science, Engineering, or a related field, or equivalent experience.
  • 5+ years of experience in site reliability engineering, DevOps, or a related software engineering role.
  • Strong understanding of cloud infrastructure (AWS, Azure, etc.) and of containerization and container orchestration technologies (Docker, Kubernetes).
  • Experience with infrastructure-as-code tools (Terraform, Ansible, etc.) for automating deployments.
  • Proficiency in scripting and programming languages such as Python, Go, or Bash.
  • Familiarity with monitoring and observability tools (Prometheus, Grafana, ELK stack).
  • Excellent problem-solving skills and the ability to work effectively in high-pressure situations.

Benefits

  • Health Care Plan (Medical, Dental & Vision)
  • Retirement Plan (401(k), IRA)
  • Paid Time Off (Vacation, Sick & Public Holidays)
  • Family Leave (Maternity, Paternity)
  • Training & Development
  • Free Food & Snacks

About the Company

Transform9