Sr. Software Engineer | Kubernetes | GPU Orchestration - REMOTE

Living Talent

New York, NY (remote)

JOB DETAILS
SKILLS
Architectural Design, Bash Scripting, Best Practices, Cloud Computing, Communication Skills, Computer Science, Cost Control, Engineering, GPU (Graphics Processing Unit), High Availability, Hybrid Cloud, Middleware Architecture, Open Source, Process Improvement, Python Programming/Scripting Language, Software Engineering, Startup
LOCATION
New York, NY
POSTED
30+ days ago

GPU Orchestration

  • Startup
  • Company size: 30
  • Remote within North America
  • Compensation: Base Salary 250k + Equity

Key Responsibilities

  • Lead design, architecture & development of K8s-based cloud infrastructure.
  • Use K8s controllers, operators & custom resources (CRs) to implement scalable, high-availability solutions.
  • Integrate Karpenter and/or other advanced tools for infrastructure optimization.
  • Architect MLOps middleware integration (dynamic workload migration, resource disaggregation).
  • Build monitoring, logging & alerting systems.
  • Drive infrastructure cost optimization through FinOps best practices in K8s deployments.
  • Promote K8s best practices & mentor software engineers.
  • Collaborate across teams to drive K8s adoption in multi-cloud and hybrid environments.
  • Contribute to open-source projects in the Kubernetes community.

Qualifications

Kubernetes Expertise
  • Designing, deploying, and managing K8s clusters (AKS, EKS, GKE, OpenStack, etc.).
  • Hands-on experience with K8s core components and ecosystem tooling (Karpenter, cluster autoscaler, CNI, CSI, CRI, CRD, operators).
  • 5+ years in Kubernetes infrastructure.
  • Contributing to open-source Kubernetes projects.
  • 10+ years of software engineering experience.
  • Proficiency in one or more of Go, Python, Bash, etc.
  • Excellent communication skills for both technical and non-technical stakeholders.
  • Bachelor’s or Master’s degree in Computer Science or related field (preferred).

Preferred Experience

  • GPU scheduling, container orchestration, HPC (high-performance computing) workloads.
  • Familiarity with multi-cloud & hybrid cloud deployments.
  • Experience with MLOps platforms (Kubeflow, TFX, etc.).
  • Experience with or knowledge of FinOps practices & cloud cost management.

About the Company

Living Talent