AI Ops Senior Technical Architect

Yantran LLC

Richardson, TX

JOB DETAILS
SKILLS
Amazon Web Services (AWS), Application Programming Interface (API), Artificial Intelligence (AI), Automation, Budgeting, Cloud Architecture, Distributed Computing, Event Correlation, GCP (Google Cloud Platform), High Availability, Incident Management, Maintain Compliance, Mentoring, Metrics, Microservices, Microsoft Azure, Noise Reduction, Organizational Skills, Performance Engineering, Reliability Engineering, Splunk, Technical Operations, Telemetry
LOCATION
Richardson, TX
POSTED
30+ days ago
AI Ops Senior Technical Architect
Location: Richardson, TX – Onsite – Candidates are expected in the office 3 days a week
Full-time / Contract
Role Summary
Lead the architecture, design, and delivery of the AIOps platform across observability, automation, self-healing, event intelligence, and AI-driven operations. Guide engineering teams, drive reliability and SLO strategies, and architect scalable multi-cloud AIOps solutions for enterprise systems.
Key Responsibilities
Define the AIOps architecture and roadmap covering telemetry, analytics, automation, and AI/ML adoption.
Architect scalable observability platforms (OpenTelemetry, Prometheus/Grafana, ELK/Loki, Jaeger/Tempo).
Lead design and implementation of event correlation, anomaly detection, RCA accelerators, and noise reduction.
Architect auto-remediation workflows, ChatOps automations, and integration with ITSM (ServiceNow/JSM).
Integrate with APM tools (Datadog, Dynatrace, Splunk, New Relic, AppDynamics) to build unified AIOps pipelines.
Drive SLO/SLI frameworks, error budgets, and reliability engineering across services.
Own multi-cloud architecture (AWS/Azure/GCP), Kubernetes platform patterns, and IaC standards.
Ensure security, compliance, IAM, data governance, and high-availability architecture for AIOps components.
Mentor engineers, review designs, lead incident reviews, and ensure platform scalability and cost efficiency.
Required Skills
AIOps & Observability
Deep expertise in OpenTelemetry, distributed tracing, and metrics/logs pipelines.
Strong understanding of AIOps signals: anomaly detection, pattern mining, event correlation.
Architecture
Distributed systems, microservices, API design, high availability, and performance engineering.
Experience designing real-time streaming pipelines using Kafka/Kinesis/Event Hubs.
APM & Monitoring Tools
Hands-on experience with Datadog / Dynatrace / Splunk / New Relic / AppDynamics / Moogsoft / BigPanda.
Automation & Self-Healing
Python/Go automation, serverless, runbooks, workflow engines (Airflow/Temporal), ChatOps bots.
Cloud & Platform
AWS/Azure/GCP architecture, Kubernetes (EKS/AKS/GKE), Terraform, GitOps, CI/CD.
Security & Governance
Strong grounding in RBAC, IAM/KMS, encryption, auditability, compliance (SOC2/ISO).
Experience
10-15 years in engineering, with 5+ years in SRE/Platform/Observability/AIOps architecture.
Proven track record designing and delivering enterprise-scale AIOps or observability platforms.
*** is an Equal Employment Opportunity employer. We promote and support a diverse workforce at all levels of the company. All qualified applicants will receive consideration for employment without regard to race, religion, color, sex, age, national origin, or disability. All applicants will be evaluated solely on the basis of their ability, competence, and performance of the essential functions of their positions, with or without reasonable accommodations. Reasonable accommodations are also available in the hiring process for applicants with disabilities. Candidates can request a reasonable accommodation by contacting the company ADA Coordinator at ***.

About the Company

Yantran LLC