4+ years of experience building production inference services and enterprise integrations using application programming interfaces (APIs) such as REST and GraphQL, event-driven patterns, continuous integration and continuous deployment (CI/CD), infrastructure as code, Docker, Kubernetes, and monitoring tools. This role is hands-on and delivery-oriented: you will ship production pipelines and services that support model training, real-time inference, and LLM applications built on Claude-, GPT/Codex-, and Gemini-class models, among others, all implemented with strong governance, observability, and cost/performance discipline.