CAE HPC System Administrator

iFlow Inc

Saline, MI

JOB DETAILS
SALARY
$60–$65
SKILLS
Administrative Skills, Application Integration, Automation, Awk, Bash Scripting, Capacity Management, Change Management, Computer Aided Engineering (CAE), Computer Storage Hardware, Configuration Management, Continuous Improvement, Csh (C Shell) Scripting, Ethernet, Hardware Administration, IBM Product Family, Identify Issues, Ksh (Korn Shell) Scripting, Licensing, Linux Administration, Linux Operating System, Machine Tool, Operating Systems, Operations Processes, PHP Scripting Language (PHP Hypertext Preprocessor), Performance Analysis, Perl Programming Language, Problem Solving Skills, Production Systems, Red Hat Linux Operating System, Reporting Dashboards, Reporting Skills, Resource Management, Root Cause Analysis, Scripting (Scripting Languages), Server Hardware, Simulation, Software Administration, Software License Management, Software Patches, Standard Operating Procedures (SOP), Systems Administration/Management, Systems Maintenance, Systems Scalability
LOCATION
Saline, MI
POSTED
2 days ago
Job Title: CAE HPC System Administrator
Location: ENGINEERING DIVISION Saline, Michigan, United States
Duration: 6 Months
Experience: 10-15 Years

Description
We are seeking a highly skilled CAE HPC System Administrator to manage, optimize, and support enterprise-level High-Performance Computing (HPC) environments dedicated to Computer-Aided Engineering (CAE) workloads.
This role is responsible for ensuring system stability, scalability, and performance of HPC clusters while supporting CAE applications, job scheduling systems, and underlying Linux infrastructure. The ideal candidate combines strong Linux systems expertise, HPC workload management experience, and a solid understanding of CAE engineering environments.

Key Responsibilities:
1. HPC Job Queuing & Workload Management
Administer, configure, and optimize HPC job scheduling environments, including IBM Spectrum LSF, OpenPBS, or equivalent schedulers.
Design and tune job queues, resource allocation policies, and scheduling strategies to support diverse CAE workloads.
Monitor system performance and utilization trends and implement improvements to maximize efficiency and throughput.
2. CAE Application and Licensing Support
Install, upgrade, test, and support CAE applications and simulation tools in production environments.
Provide integration support between CAE applications and HPC scheduling systems.
Manage CAE software licensing systems (e.g., FlexLM, RLM) and ensure availability.
Troubleshoot application-related issues and ensure minimal disruption to engineering activities.
3. Linux Systems Administration & Automation
Administer and maintain Red Hat Enterprise Linux (RHEL) environments across HPC clusters.
Perform OS provisioning, deployment, and patch management using automated tools (e.g., PXE or configuration management solutions).
Develop and maintain scripts (Bash, Korn shell, C Shell, Perl, Awk, or equivalent) to automate system monitoring, health checks, and routine administrative tasks.
Maintain system logs, monitoring processes, and standard operating procedures.
4. Hardware & Infrastructure Management
Troubleshoot and resolve issues related to servers, storage systems, and high-performance networking (e.g., InfiniBand, high-speed Ethernet).
Support hardware lifecycle activities including installation, maintenance, and upgrades.
Conduct capacity planning based on system utilization trends and future demand.
5. Operations, Monitoring & Continuous Improvement
Perform system health checks, monitoring, and incident tracking for HPC and CAE environments.
Document system configurations, procedures, incidents, and best practices.
Track outages, analyze root causes, and implement preventive measures.
Follow change management processes for system updates and deployments.
Provide accurate reporting (e.g., utilization, incidents, system performance) and support project initiatives.

Requirements
3+ years of Linux system administration experience (preferably RHEL environments).
Hands-on experience managing HPC clusters and job schedulers (LSF, Slurm, PBS, or similar).
Proven experience in CAE application support and integration.
Strong scripting skills (Bash, Shell, Perl, or equivalent).
Experience with OS deployment, patching, and system automation.
Solid understanding of enterprise server hardware, storage, and networking fundamentals.
Experience with CAE tools such as Ansys, LS-DYNA, Nastran, or similar.
Familiarity with high-performance networking technologies (e.g., InfiniBand) is a plus.
Experience developing internal tools or dashboards (e.g., PHP or web-based tooling) is a plus.
Position Type / Expected Hours
Hybrid Full-time: Standard business hours with flexibility required to support maintenance windows and critical production issues.
Occasional after-hours or weekend work may be required based on business needs.

About the Company

iFlow Inc