Infrastructure admin for AI services (Azure & AWS)

Axelon Services Corporation

Aliso Viejo, CA (remote)

JOB DETAILS
SKILLS
ARM (Azure Resource Manager), AWS Lambda, Amazon Elastic Compute Cloud (EC2), Amazon Simple Storage Service (S3), Amazon Web Services (AWS), Artificial Intelligence (AI), Bash Scripting, Best Practices, Cloud Computing, Computer Science, Continuous Deployment/Delivery, Continuous Integration, Cost Control, Cryptography, Data Science, Docker, GPU (Graphics Processing Unit), GitHub, High Availability, Information Technology & Information Systems, Linux Administration, Machine Learning, Maintain Compliance, Microsoft Azure, Network Configuration Management, Network Security, Protective Services, Python Programming/Scripting Language, Regulatory Compliance, Resource Management, Resource Utilization, Scripting (Scripting Languages), Software Engineering, Software Patches, Team Player, Virtual Machine (VM), Windows PowerShell
LOCATION
Aliso Viejo, CA
POSTED
8 days ago


Job title: Infrastructure admin for AI services (Azure & AWS)
Location: Remote
Pay rate: $50/hr


Key Responsibilities


  • Design, deploy, and manage cloud infrastructure supporting AI/ML workloads on AWS and Azure
  • Manage compute, storage, and networking resources such as EC2, Azure Virtual Machines, GPU instances, EKS, ECS, Lambda, Kubernetes clusters, S3, VPC, and Route 53
  • Provision and configure storage, networking, and security services for AI platforms
  • Ensure high availability, scalability, and reliability of AI environments
  • Deploy and maintain AI/ML services such as Amazon SageMaker, Microsoft Foundry, and Azure Machine Learning
  • Support data scientists and ML engineers by providing optimized infrastructure for model training and deployment
  • Implement Infrastructure as Code (IaC) using Terraform, CloudFormation, ARM templates / Bicep, and Dockerfiles
  • Automate and set up environment provisioning, patching, and scaling
  • Deploy and manage containerized AI workloads using Docker, Kubernetes, Amazon EKS, Azure Kubernetes Service (AKS), and ECS
  • Monitor system health, performance, and resource utilization using CloudWatch, Azure Monitor, and Datadog or Prometheus
  • Optimize infrastructure for cost, performance, and GPU utilization
  • Implement cloud security best practices including IAM / RBAC management, network security groups, encryption, and secrets management
  • Ensure compliance with organizational and regulatory standards
  • Integrate AI infrastructure with CI/CD pipelines
  • Support automated deployment of models and AI services



Required Qualifications

  • Bachelor's degree in Computer Science, Information Systems, or a related field
  • 5+ years of experience in infrastructure administration or cloud engineering
  • Strong hands-on experience with AWS cloud services and Microsoft Azure cloud services
  • Experience supporting AI/ML infrastructure or data platforms
  • Proficiency with Linux administration, scripting (Python, Bash, PowerShell), and IaC tooling (Terraform, Terragrunt)
  • Experience with Docker and Kubernetes
  • Experience with GitHub Actions
  • Experience with LLM infrastructure setup
  • Experience working in a centralized team that triages incoming requests and issues

About the Company

Axelon Services Corporation