HPC / AI Software Infrastructure Lead E

KLATencor Corporation

Ann Arbor, MI

JOB DETAILS
SKILLS
Algorithms, Artificial Intelligence (AI), Best Practices, C++ Programming Language, CUDA (Compute Unified Device Architecture), Cloud Computing, Coaching, Computer Programming, Computer Skills, Computer Vision, Continuous Deployment/Delivery, Continuous Integration, Cross-Functional, Data Management, Data Science, Deep Learning, DevOps, Distributed Computing, Ecosystems, Electronics, Flat Panel Displays, GPU (Graphics Processing Unit), Image Processing, Integrated Circuit Packaging, Laptop PC, Leadership, Linux System Internals/Programming, Machine Learning, Mentoring, Parallel Computing, Performance Tuning/Optimization, Physics, Printed Circuit Board (PCB), Problem Solving Skills, Production Systems, Python Programming/Scripting Language, Research & Development (R&D), Sales, Semiconductor Manufacturing, Semiconductors, Smartphones, Software Engineering, Systems Reliability, Talent Management, Technical Leadership, Wafer Manufacturing
LOCATION
Ann Arbor, MI
POSTED
1 day ago
Company Overview
KLA is a global leader in diversified electronics for the semiconductor manufacturing ecosystem. Virtually every electronic device in the world is produced using our technologies. No laptop, smartphone, wearable device, voice-controlled gadget, flexible screen, VR device or smart car would have made it into your hands without us. KLA invents systems and solutions for the manufacturing of wafers and reticles, integrated circuits, packaging, printed circuit boards and flat panel displays. The innovative ideas and devices that are advancing humanity all begin with inspiration, research and development. KLA focuses more than average on innovation and we invest 15% of sales back into R&D. Our expert teams of physicists, engineers, data scientists and problem-solvers work together with the world's leading technology providers to accelerate the delivery of tomorrow's electronic devices. Life here is exciting and our teams thrive on tackling really hard problems. There is never a dull moment with us.
Job Description/Preferred Qualifications
HPC/AI Software Infrastructure Leads are core to KLA's technology. While we do not currently have an opening, we are always building our HPC/AI Software Infrastructure Lead Engineering talent community, and we are interested in learning about your background.
Apply to this posting for Future Opportunities with KLA.
At KLA, we're pushing the boundaries of semiconductor inspection through advanced AI and high-performance computing. We are looking for a hands-on technical leader to architect and scale the next generation of AI/HPC infrastructure powering our most critical imaging and data platforms. This role is ideal for someone who thrives at the intersection of distributed systems, GPU computing, and real-world AI workloads, and who enjoys building and mentoring high-performing engineering teams while driving technical excellence.
What You'll Do
Lead the architecture and development of large-scale HPC and AI infrastructure supporting cutting-edge image processing and machine learning workloads
Design scalable, high-performance distributed systems that unify traditional image processing with modern AI/Deep Learning pipelines
Drive GPU-accelerated computing strategies, optimizing performance across compute, storage, and networking layers
Partner cross-functionally with hardware, algorithms, and product teams to deliver robust, production-ready platforms
Establish engineering best practices (code quality, CI/CD, observability, performance tuning) for mission-critical systems
Mentor and develop engineers, providing technical guidance, coaching, and growth opportunities for junior team members
Serve as a technical leader and decision-maker, influencing architecture and long-term platform strategy
What You Bring
Experience
10+ years in software engineering, including leading and scaling technical teams
Proven success building distributed systems in HPC, AI/ML, or cloud-native environments
Track record of delivering performance-critical infrastructure at scale
Experience mentoring and growing early- and mid-career engineers
Technical Expertise
Deep understanding of distributed systems, parallel computing, and Linux systems programming
Strong programming skills in C++, Python, or similar systems-level languages
Experience with GPU computing (CUDA, ROCm) and modern AI frameworks (PyTorch, TensorFlow, etc.)
Familiarity with high-performance storage systems, networking, and data pipelines
Strong foundation in CI/CD, DevOps, and production system reliability
Bonus Experience
Background in image processing, computer vision, or scientific computing
Experience supporting hybrid HPC + AI workloads in production environments
Leadership & Impact
Passion for developing talent and building inclusive, high-performing teams
Ability to operat
