AI Platform Engineer - TS/SCI with CI Poly

PGTEK

McLean, VA

JOB DETAILS
SALARY
$140,000–$185,000
SKILLS
Access Control, Amazon Web Services (AWS), Analysis Skills, Artificial Intelligence (AI), Automation, Bash Scripting, Best Practices, Centralized Operations/Management, Cloud Computing, Communication Skills, Consulting, Continuous Deployment/Delivery, Continuous Improvement, Continuous Integration, Cross-Functional, DNS (Domain Name System), DevOps, Docker, Git, GitHub, Go Programming Language (Golang), Government, HTTP (Hypertext Transfer Protocol), High Availability, Identify Issues, Jenkins, Leadership, Linux Administration, Load Balancing, Machine Tool, Maintain Compliance, Mentoring, Microsoft Azure, Multiplatform/Cross-Platform, Network Configuration Management, Network Operations Center, Network Security, Network Support, Operational Audit, Operations Security (OPSEC), Performance Analysis, Performance Management, Presentation/Verbal Skills, Problem Solving Skills, Process Improvement, Python Programming/Scripting Language, Quality Assurance Methodology, Reliability Engineering, Resource Management, Resource Utilization, Scripting (Scripting Languages), Security Infrastructure, Sensitive Compartmented Information (SCI), Software Development, Source Code/Configuration Management (SCM), Splunk, TCP/IP (Transmission Control Protocol/Internet Protocol), Team Player, Technical Leadership, Top Secret Clearance, Unit Test, Unix System Administration, Vulnerability Scanners, Web Client Plug-ins, Willing to Travel, Writing Skills
LOCATION
McLean, VA
POSTED
2 days ago

AI Platform Engineer

Location: McLean, VA (Onsite 5 Days per Week)
Employment Type: Full-Time
Salary Range: $125,000 - $185,000
Clearance Required: Active TS/SCI with Counterintelligence Polygraph
Certification Requirement: Current IAT Level II certification meeting DoD 8570 requirements

Position Overview

We are seeking an experienced AI Platform Engineer to play a critical role in building, maintaining, securing, and optimizing the infrastructure that supports advanced Artificial Intelligence (AI) workloads. This individual will be responsible for designing and managing scalable Kubernetes environments, implementing automated deployment pipelines, and ensuring platform reliability, security, and performance.

The ideal candidate combines deep expertise in cloud-native technologies, Kubernetes administration, DevOps practices, and automation with strong problem-solving and collaboration skills. This role will work closely with engineering, operations, and security teams to deliver highly available AI platform solutions in a mission-critical environment.

Key Responsibilities

Kubernetes & Platform Engineering

  • Design, deploy, secure, maintain, and upgrade highly available Kubernetes clusters across cloud and on-premises environments.
  • Manage Kubernetes control plane components, worker nodes, and supporting infrastructure.
  • Implement and maintain containerized workloads using Docker and Kubernetes best practices.
  • Configure and manage Kubernetes resources including Pods, Deployments, StatefulSets, Services, Ingress, ConfigMaps, Secrets, Persistent Volumes, and Namespaces.
  • Support advanced networking configurations, including CNI plugins, network policies, service meshes, and DNS services.

Security & Compliance

  • Implement security best practices across Kubernetes environments.
  • Manage RBAC, admission controllers, vulnerability scanning, secret management, and network security controls.
  • Ensure platform compliance with government and organizational security requirements.
  • Support secure deployment practices and infrastructure hardening initiatives.

DevOps & Automation

  • Design, implement, and maintain CI/CD pipelines for containerized applications.
  • Utilize GitOps methodologies and tools to automate application deployment and platform management.
  • Develop infrastructure as code (IaC) solutions using Terraform, Pulumi, CloudFormation, or similar tools.
  • Create automation scripts and tooling using Python, Go, Bash, or related languages.

Monitoring, Performance & Reliability

  • Implement monitoring, logging, alerting, and observability solutions across platform environments.
  • Diagnose and resolve complex performance issues affecting Kubernetes clusters and applications.
  • Optimize resource utilization and platform scalability.
  • Support distributed tracing, centralized logging, and operational analytics initiatives.
  • Apply DevOps and Site Reliability Engineering (SRE) principles to improve platform resilience and operational excellence.

Collaboration & Leadership

  • Collaborate with development, operations, security, and infrastructure teams.
  • Lead technical initiatives and mentor junior engineers.
  • Drive continuous improvement efforts across platform engineering and deployment practices.
  • Communicate effectively with technical and non-technical stakeholders.

Requirements

  • Extensive experience designing, deploying, and managing Kubernetes environments (EKS, AKS, GKE, OpenShift, or self-managed clusters).
  • Advanced knowledge of Docker and containerization technologies.
  • Strong understanding of Kubernetes networking, service meshes, and cluster architecture.
  • Expertise in Kubernetes security, access controls, and secret management.
  • Experience with CI/CD platforms such as Jenkins, GitLab CI/CD, GitHub Actions, Tekton, Argo Workflows, or similar.
  • Proficiency with Infrastructure as Code tools including Terraform, Pulumi, or CloudFormation.
  • Strong scripting and automation experience using Python, Go, Bash, or similar languages.
  • Experience with GitOps tools such as Argo CD.
  • Hands-on experience with monitoring and observability platforms including Prometheus, Grafana, ELK/OpenSearch, Datadog, or Splunk.
  • Strong Linux/Unix administration background.
  • Solid understanding of networking concepts including TCP/IP, DNS, HTTP, and load balancing.
  • Expert-level Git and version control experience.

Professional Skills

  • Exceptional troubleshooting and analytical problem-solving abilities.
  • Strong verbal and written communication skills.
  • Ability to work effectively in cross-functional teams.
  • Experience mentoring engineers and leading technical efforts.
  • Strong sense of ownership and accountability.
  • Adaptability and commitment to continuous learning.

Preferred Qualifications

  • Certified Kubernetes Administrator (CKA)
  • Certified Kubernetes Application Developer (CKAD)
  • Certified Kubernetes Security Specialist (CKS)
  • AWS Certified DevOps Engineer - Professional
  • Microsoft Certified: DevOps Engineer Expert
  • Experience developing Kubernetes Operators and Custom Resource Definitions (CRDs)
  • Experience building Internal Developer Platforms (IDPs)
  • Familiarity with testing methodologies including unit, integration, and end-to-end testing

Travel Requirements

  • Up to 20% travel as required for on-site installations, maintenance, and troubleshooting activities at customer locations or data centers.

Benefits

Our comprehensive benefits package for full-time salaried employees takes effect immediately on the start date. Benefits include comprehensive PPO medical coverage with a Health Savings Account (HSA) option, a vision plan, and dental insurance, with the base dental plan paid for by PGTEK. Premiums for life insurance, short- and long-term disability, and critical illness insurance are covered. Additionally, PGTEK offers a matching 401(k) plan and a discount on pet insurance through ASPCA Pet Insurance. An Employee Assistance Program is available at no cost to all employees. PGTEK offers generous PTO and holidays, and an Education Assistance Program is available after 12 months of employment.

ABOUT PGTEK

PGTEK is a true consulting organization dedicated to helping clients achieve their business and technology objectives by drawing on our decades of experience and business relationships. PGTEK invests in the educational advancement of our staff by providing the resources needed to complete professional and business certifications. Our company is our people, and we treat them like family.

EOE, including disability/veterans

About the Company

PGTEK

Ronald Podmilsak founded PGTEK in 2000 to provide IT professional services and geospatial analysis to the U.S. Intelligence Community and the Department of Defense. That same year, his son Scott Podmilsak founded Strategic Business Systems, Inc. (SBS), an IT infrastructure services firm. Scott Podmilsak and his management team had tremendous success, and in 2008 he sold SBS to Brocade Communications (NASDAQ: BRCD). In 2010, Scott Podmilsak joined PGTEK as Chief Executive Officer and board member. Upon joining, he reassembled the relevant executive team he had created at SBS. Together again, this seasoned team of executives has been applying the lessons learned in technology, operations, customer service, and employee morale to create a firm that greatly surpasses its last endeavor. Today, PGTEK is an industry leader in geospatial and datacenter technologies, serving both the commercial and federal market segments with innovative solutions.

COMPANY SIZE
100 to 499 employees
INDUSTRY
Computer/IT Services
FOUNDED
2000
WEBSITE
http://www.pgtek.com