This is a hands-on Staff DevOps Engineer role responsible for designing, operating, and evolving a highly available, multi-tenant platform on AWS. You will work closely with software engineering to deploy, operate, and scale production systems while driving improvements in reliability, automation, and performance.
This role requires strong ownership of infrastructure and production systems. You will also provide technical leadership and mentorship to other DevOps engineers.
In addition, you will help introduce and operationalize AI/LLM capabilities within the platform.
What You'll Be Responsible For
Production Reliability & Incident Ownership
Observability & System Insight
AI / LLM Systems (Emerging Area)
What You Bring
Nice to Have