HPC Systems Engineer

Science Applications International Corp

Charlottesville, VA

JOB DETAILS
SKILLS
Administrative Skills, Ansible, Automation, Bash Scripting, CUDA (Compute Unified Device Architecture), Command Line, Computer Operations, Computer Servers, Computer Systems, Configuration Management, Data Science, Distributed Computing, DoD Directive 8140, DoD Directive 8570, Docker, Emerging Technology, Enterprise Computing, File Systems, Fortune 500 Customers, GPU (Graphics Processing Unit), High Throughput, IAT - Information Assurance Technical, Identify Issues, Information Technology & Information Systems, Integrated Circuits (ICs), Linux Administration, Linux Operating System, MPI, Machine Tool, Network Architecture/Engineering, On Site Support, OpenMP, Parallel Computing, Performance Analysis, Professional Services, Puppet (Configuration Management), Python Programming/Scripting Language, Research Laboratory, Schedule Development, Scripting (Scripting Languages), Sensitive Compartmented Information (SCI), Simulation, Software Engineering, Systems Administration/Management, Systems Engineering, Technical Support, Testing, Top Secret Clearance, United States Department of Defense (DoD), Work From Home
LOCATION
Charlottesville, VA
POSTED
30+ days ago

HPC Systems Engineer

Job ID: 2610670

Location: Charlottesville, VA, United States

Date Posted: Mar 26, 2026

Category: Engineering and Sciences

Subcategory: Systems Engineer

Schedule: Full-Time

Shift: Day Job

Travel: No

Minimum Clearance Required: Top Secret

Clearance Level Must Be Able to Obtain: TS/SCI

Potential for Remote Work: On-Site

Job Description

SAIC is looking for a highly qualified HPC Systems Engineer to support the Army's Golden Dome initiative. The engineer will support the deployment and sustainment of Linux-based High Performance Computing (HPC) cluster environments used for distributed compute workloads, simulation environments, and GPU-enabled processing.

The environment will include:

  • multi-node Linux compute clusters
  • workload scheduling platforms such as Slurm or PBS
  • cluster provisioning frameworks (e.g., xCAT, Warewulf)
  • high-performance networking technologies including RDMA / InfiniBand
  • distributed parallel compute workloads utilizing MPI or OpenMP
  • GPU-enabled compute resources supporting CUDA-based processing

The system will be used to support scientific computing, simulation workloads, and other distributed compute operations within a secure research environment.

Candidates should be comfortable working within cluster-scale computing environments where performance, scheduler configuration, and distributed workload execution are critical operational factors.

The HPC Systems Engineer will support the build-out, configuration, and sustainment of HPC cluster platforms.

The role focuses on:

  • cluster platform configuration
  • scheduler administration
  • distributed compute troubleshooting
  • performance analysis across compute, storage, and network layers
  • GPU compute workload support
  • automation and operational tooling

Candidates should have experience working with multi-node Linux cluster environments and distributed compute workloads.

Core Technical Capabilities

Candidates should demonstrate capability in most of the following areas.

HPC Cluster Platforms

Experience supporting multi-node Linux compute clusters, including node integration, configuration, and operational sustainment.

Experience with cluster provisioning tools such as xCAT, Warewulf, or similar node deployment systems is beneficial.

Workload Scheduling Platforms

Experience supporting distributed compute workloads using schedulers such as:

  • Slurm
  • PBS / PBS Pro
  • Torque
  • Grid Engine

Candidates should understand queue configuration, job submission workflows, and scheduler troubleshooting.

Candidates should understand how workload schedulers interact with distributed compute workloads and containerized execution environments.

Linux Systems Administration

Strong Linux administration experience including:

  • command-line system administration
  • server and compute node configuration
  • system troubleshooting in distributed compute environments

Experience with RHEL-based environments is preferred.

Distributed and Containerized Workloads

Experience supporting distributed compute workloads utilizing parallel computing frameworks such as:

  • MPI
  • OpenMP
  • GPU compute frameworks

Candidates should understand how workload schedulers interact with distributed compute workloads and containerized execution environments within HPC clusters.

Familiarity with container technologies commonly used in HPC environments such as:

  • Docker
  • Podman
  • Singularity / Apptainer

Candidates should understand how containerized workloads interact with schedulers, GPU resources, and distributed compute environments.

Experience supporting containerized HPC workloads or integrating container platforms with cluster infrastructure is desirable.

HPC Networking

Familiarity with high-performance networking technologies including:

  • RDMA networking
  • InfiniBand
  • high-throughput cluster networking architectures

Candidates should be comfortable assisting with troubleshooting cluster communication or performance issues.

GPU Compute Environments

Experience supporting GPU-enabled compute environments and workloads utilizing CUDA frameworks is desirable.

Automation and Operational Tooling

Experience writing scripts or operational tooling using languages such as:

  • Bash
  • Python

Automation experience supporting system administration or cluster operations is beneficial.

Qualifications

Candidates must meet the following requirements:

  • Bachelor's degree in a science or technology field; 10 additional years of experience may be substituted for the degree
  • 8+ years of experience is required
  • Minimum 6 years of experience administering Linux systems in enterprise, research computing, or distributed compute environments
  • An active Top Secret clearance is required; an active TS/SCI clearance must be obtained before beginning work
  • 100% onsite support in Charlottesville, VA
  • Experience supporting distributed compute environments or HPC cluster platforms
  • Experience working with workload schedulers such as Slurm, PBS, Torque, or similar systems
  • Experience administering Linux systems through command-line interfaces
  • Experience with scripting or automation tools (Bash, Python, or similar)
  • Ability to obtain required DoD 8140 (8570) IAT Level II certification
  • Candidates must have direct experience with HPC or distributed compute environments.

Candidates with the following experience are strongly preferred:

  • Administration of multi-node HPC cluster environments
  • Experience with parallel or distributed file systems such as Lustre, BeeGFS, or GPFS
  • Experience supporting GPU-enabled compute environments and CUDA workloads
  • Experience with configuration management tools such as Ansible or Puppet
  • Experience supporting research, laboratory, or mission computing environments
  • Experience supporting systems within DoD/DoW or IC environments

Overview

SAIC accepts applications on an ongoing basis and there is no deadline.

SAIC is a premier Fortune 500 mission integrator focused on advancing the power of technology and innovation to serve and protect our world. Our robust portfolio of offerings across the defense, space, civilian and intelligence markets includes secure high-end solutions in mission IT, enterprise IT, engineering services and professional services. We integrate emerging technology, rapidly and securely, into mission critical operations that modernize and enable critical national imperatives.

We are approximately 24,000 strong; driven by mission, united by purpose, and inspired by opportunities. SAIC is an Equal Opportunity Employer. Headquartered in Reston, Virginia, SAIC has annual revenues of approximately $7.5 billion. For more information, visit saic.com. For ongoing news, please visit our newsroom.

About the Company

Science Applications International Corp

SAIC is a premier Fortune 500® technology integrator driving our nation's digital transformation. Our robust portfolio of offerings across the defense, space, civilian, and intelligence markets includes secure high-end solutions in engineering, IT modernization, and mission solutions. Using our expertise and understanding of existing and emerging technologies, we integrate the best components from our own portfolio and our partner ecosystem to deliver innovative, effective, and efficient solutions that are critical to achieving our customers' missions. We are a team of 26,000 strong, driven by mission, united by purpose, and inspired by opportunity. Headquartered in Reston, Virginia, SAIC has annual revenues of approximately $7.1 billion. For more information, visit saic.com.
COMPANY SIZE
10,000 employees or more
INDUSTRY
Computer/IT Services
FOUNDED
2013
WEBSITE
https://jobs.saic.com/