Site Reliability Engineer

ExtendMyTeam

Austin, TX

JOB DETAILS
SKILLS
Amazon Web Services (AWS), Automation, Best Practices, Business Operations, Cloud Computing, Continuous Deployment/Delivery, Continuous Integration, Cross-Functional, Data Management, Database Extract Transform and Load (ETL), DevOps, Enterprise Applications, Identify Issues, Incident Response, Operational Improvement, Operational Support, Operations Processes, Performance Analysis, Performance Management, Performance Tuning/Optimization, Problem Solving Skills, Production Control, Production Support, Production Volume, Reliability Engineering, Root Cause Analysis, Scalable System Development, Snowflake Schema, Splunk, Team Player
POSTED
4 days ago

Join a growing technology organization focused on building scalable cloud and data platform solutions that support analytics, reporting, and operational performance across enterprise applications. This team is investing heavily in platform reliability, automation, and operational excellence to support modern data infrastructure and high-volume production workloads.

This is an opportunity to work on highly visible infrastructure and reliability initiatives that directly impact platform stability, scalability, performance, and engineering efficiency.

Position Summary

We are seeking a Senior Site Reliability Engineer (SRE) to support and optimize cloud-based data platform infrastructure with a strong focus on Snowflake environments, automation, and operational reliability. This individual will partner closely with Data Engineering, DevOps, and Security teams to improve platform performance, support production-grade data pipelines, and establish scalable operational best practices.

This is a hands-on engineering role focused on infrastructure reliability, automation, monitoring, and production support within modern cloud-native environments.

Responsibilities

  • Support and optimize Snowflake environments for reliability, scalability, and cost efficiency

  • Build and maintain automation, CI/CD pipelines, and infrastructure-as-code solutions

  • Monitor platform performance and troubleshoot production issues across data systems and pipelines

  • Support ETL/ELT workflows and cloud-based data ingestion processes

  • Work within Kubernetes/containerized environments and cloud platforms such as AWS

  • Partner cross-functionally with engineering teams to improve platform reliability and operational processes

  • Participate in incident response, root cause analysis, and continuous operational improvements

Required Experience / Ideal Background

  • 5+ years of experience in SRE, DevOps, Platform Engineering, or related infrastructure roles

  • Hands-on experience with Snowflake administration, monitoring, and performance optimization

  • Strong experience with Kubernetes, Terraform, CI/CD pipelines, and cloud infrastructure environments

  • Proficiency with Python scripting and SQL

  • Experience supporting distributed systems, cloud infrastructure, and production-grade data platforms

  • Familiarity with Airflow, Kafka, Helm, Argo CD, or similar technologies (preferred)

  • Experience with monitoring and observability tools such as Datadog, Splunk, or CloudWatch

  • Strong problem-solving, communication, and collaboration skills

Additional Information

  • Hybrid role based in Austin, TX, or Cary, NC

  • Applicants must be authorized to work in the U.S. without sponsorship

  • Competitive compensation, benefits, flexible time off, and career development opportunities




About the Company

ExtendMyTeam