Sr. DevOps Engineer - AI and Site Reliability Engineering

Teradata Corp

California, CA

JOB DETAILS
SKILLS
Amazon Web Services (AWS), Ansible, Artificial Intelligence (AI), Atlassian JIRA, Automation, Bug Tracking/Defect Management, Cloud Computing, Communication Skills, Computer Science, Configuration Management, Customer Experience, DevOps, Enterprise Applications, Enterprise Protection, Establish Priorities, Git, High Tech Industry, Identify Issues, Jenkins, Linux Administration, Machine Learning, Machine Tool, Microsoft Windows Azure, Modeling Languages, NCR Teradata, NoSQL, Operations Planning, Product Engineering, Production Systems, Programming Languages, Puppet (Configuration Management), Python Programming/Scripting Language, Regulatory Compliance, Reliability Engineering, Risk Management, Root Cause Analysis, SQL Databases, Software Engineering, Software Upgrades, Source Code/Configuration Management (SCM), Strategic Planning, Systems Reliability, Team Lead/Manager, Testing, Use Cases
POSTED
1 day ago

Our company

At Teradata, we believe that people thrive when empowered with better information. Teradata Autonomous Knowledge Platform activates enterprise intelligence by unifying data, knowledge and business context to achieve tangible outcomes. With Teradata, organizations can provide agents with full context for impact when it matters. Our solution lets businesses connect and scale on premises, in the cloud, or through a hybrid approach. Teradata delivers real business value with AI.

What You'll Do

  • Working on a team of professionals, you will design, implement, test, deploy, administer, and continually improve software solutions to ensure system reliability and availability, mitigate operational risks, track system health, and improve mean-time-to-discover and mean-time-to-respond for operational issues.
  • You will help lead chaos engineering efforts in a production-like environment, exposing systems to simulations of real-world turbulence in order to identify and quantify operational weaknesses and develop remediation strategies.
  • You will leverage modern AI technologies, including large language models, machine learning, and agentic systems, both to increase the operational efficiency of the team and to measure and improve the reliability, scalability, observability, supportability, and performance of Teradata software.
  • You will become a subject-matter expert in the production deployment and upgrade of Teradata software and of the full software stack it relies on, from the network layer all the way to the observability tooling.

Who You'll Work With

  • You'll work on a globally distributed team of DevOps professionals, with engineers focused on site reliability engineering and observability.
  • You'll work closely with product engineering and cloud operations personnel to understand operational requirements and identify and remediate operational deficits.
  • You'll work with security and compliance teams to help provide evidence necessary to meet Teradata's compliance obligations.
  • You'll report to a Sr. Manager, Site Reliability Engineering.

What Makes You A Qualified Candidate

  • Bachelor's degree or equivalent in computer science or a related field; master's degree or equivalent preferred.
  • 4+ years of industry experience.
  • Experience with at least one major cloud service provider (AWS, Azure, and/or Google Cloud), preferably all three. CSP developer or architect certifications preferred.
  • Experience building and deploying complex software solutions to significant operational problems. Proficiency with at least one modern programming language such as Python, and with a modern source control tool, preferably Git.
  • Familiarity with machine learning libraries such as TensorFlow and scikit-learn.
  • Experience building and deploying AI systems via cloud-based generative AI and agentic AI platforms such as AWS Bedrock, AWS SageMaker, Azure AI Foundry, Google Vertex AI, and Google Agentspace.
  • Experience with at least one modern defect tracking tool, preferably Jira.
  • Experience with an infrastructure-as-code (IaC) cloud provisioning tool, preferably Terraform, and with a configuration management tool such as Ansible or Puppet.
  • Experience with Grafana or an equivalent observability tool.
  • Experience with a build/deployment automation tool such as Jenkins or Bamboo.
  • Familiarity with both SQL and NoSQL databases, and with the use cases for each.
  • Experience administering Linux-based systems.

What You'll Bring

  • 4+ years of experience in the software industry in a DevOps or site reliability engineering role.
  • A passion for constant, iterative improvement over the status quo.
  • An in-depth understanding of site reliability engineering principles, and how to measure and improve the reliability, scalability, supportability, and observability of production-deployed enterprise software, with a focus on real-world operational and customer experience.
  • An understanding of enterprise software deployment and security/compliance principles.
  • Proficiency with multi-layered technical troubleshooting and root-cause analysis.
  • The ability to quickly and comprehensively decompose a problem, identifying dependencies and defining tasks, and to think creatively and holistically about solutions.
  • The ability to work both independently and collaboratively in a fast-paced environment, and adjust as priorities change.
  • The ability to communicate concisely but effectively with colleagues, leaders, and stakeholders, and tailor communications to the needs and understanding of a particular audience.
  • The flexibility to work on a globally distributed team managed from the United States.

Why We Think You'll Love Teradata

We prioritize a people-first culture because we know our people are at the very heart of our success. We embrace a flexible work model because we trust our people to make decisions about how, when, and where they work. We focus on well-being because we care about our people and their ability to thrive both personally and professionally. We are committed to actively working to foster an inclusive environment that celebrates people for all of who they are.

