Platform & DevOps Engineer

Avtal

Austin, TX

JOB DETAILS

SKILLS

Amazon Elastic Compute Cloud (EC2), Amazon Simple Storage Service (S3), Amazon Web Services (AWS), Ansible, Application Programming Interface (API), Artificial Intelligence (AI), Automation, Best Practices, Cloud Computing, Collection Agency, Continuous Deployment/Delivery, Continuous Integration, Credit and Collections, Customer Experience, Debugging Skills, DevOps, Git, High Availability, Identify Issues, Information/Data Security (InfoSec), Linux Administration, Linux Operating System, Machine Tool, Onboarding, Performance Analysis, Problem Solving Skills, Process Management, Production Support, Production Systems, Python Programming/Scripting Language, Reliability Engineering, Revenue Growth, Root Cause Analysis, Software Engineering, Source Code/Configuration Management (SCM), System Operations, Unix Shell Programming

POSTED

30+ days ago

MTS DevOps Engineer

Location: Austin, TX
Job Type: Full-Time
Department: Engineering / DevOps

About Avtal, Inc.

We are a VC-backed company that grew revenue 35x in the past year. We help third-party debt collection agencies deliver a digital, end-to-end self-service experience for their consumers.

About the Role

We are looking for a skilled and motivated MTS DevOps Engineer with strong experience in AWS, Linux, infrastructure automation, and CI/CD, along with practical experience supporting AI-enabled systems in production. In this role, you will be instrumental in building, maintaining, and scaling our cloud-native infrastructure, improving deployment workflows, and ensuring the reliability, security, performance, and auditability of our systems in a highly regulated environment. You will also help support the infrastructure and operational foundations needed for AI-powered applications, including secure runtime environments, observability, scalable service orchestration, and cost-conscious operations.

Responsibilities

Build and maintain infrastructure automation tools using Ansible, Terraform, Python, Go, and shell scripting
Develop and operate secure, scalable infrastructure on AWS (e.g., EC2, S3, RDS, IAM, CloudWatch)
Maintain and optimize Linux-based systems across development and production environments
Implement and manage CI/CD pipelines and automated deployment workflows
Support infrastructure for AI-powered services, including runtime reliability, operational visibility, and secure service configuration
Help enable LLM API integrations, AI service orchestration, secrets management, and secure runtime environments for AI-enabled applications
Monitor system health, performance, reliability, security, and AI service observability using modern tooling
Troubleshoot production issues, perform root cause analysis, and implement durable improvements
Collaborate with engineering teams to improve infrastructure reliability, scalability, developer productivity, and operational resilience
Document infrastructure processes, runbooks, and best practices to support knowledge sharing and onboarding

Requirements

4+ years of experience in DevOps, SRE, or Infrastructure Engineering
Proficiency in infrastructure automation and tooling using Ansible, Terraform, Python, Go, and shell scripting
Deep understanding of Linux system administration, shell scripting, and process management
Proven experience with AWS services such as EC2, S3, RDS, IAM, and CloudWatch
Hands-on experience with CI/CD systems and version control (Git)
Familiarity with infrastructure needs for AI-enabled systems, such as model API integrations, service orchestration, observability, cost monitoring, or secure data handling
Strong debugging, troubleshooting, and problem-solving skills
Ability to build and operate systems with attention to reliability, security, and auditability in a highly regulated environment

Nice to Have

Experience supporting production systems that include LLM-based or other AI-powered capabilities
Familiarity with AI observability, evaluation support tooling, guardrails, and cost/performance monitoring
Experience with vector databases, embeddings pipelines, or retrieval infrastructure
Hands-on experience with infrastructure as code, including Terraform or CloudFormation
Background in Site Reliability Engineering (SRE) practices
Familiarity with monitoring and observability tools such as Prometheus, Grafana, and Kibana
Understanding of secure infrastructure design and cloud compliance best practices
Experience supporting high-availability production systems in regulated or security-conscious environments
