Kubernetes Networking Platform Engineer

MROADS

Bellona, NY

JOB DETAILS
JOB TYPE
Full-time
SKILLS
Amazon Web Services (AWS), Application Programming Interface (API), Architectural Services, Artificial Intelligence (AI), Automation, Best Practices, Budgeting, Channel Strategies, Cloud Computing, Code Reviews, Communication Skills, Computer Science, Continuous Improvement, Cross-Functional, Customer/Client Research, DNS (Domain Name System), Data Recovery, DevOps, Disaster Recovery, Distributed Computing, Documentation, Establish Priorities, Failover, GCP (Google Cloud Platform), High Availability, Identify Issues, Java, Leadership, Load Balancing, Machine Tool, Mentoring, Metrics, Microsoft Azure, Network Administration/Management, Network Integration, Network Performance/Analysis, Network Routing, Network Security, On Call, Order Delivery, Programming Languages, Programming Tools, Public Cloud, Python Programming/Scripting Language, Reliability Engineering, Root Cause Analysis, Security Infrastructure, Software Engineering, Standards Development, System Operations, Systems Administration/Management, Systems Engineering, Technical Leadership, Time Management, Traffic Shaping, Trend Analysis, Usability Engineering, User Documentation, Vendor/Supplier Evaluation, nginx Web Server
LOCATION
Bellona, NY
POSTED
30+ days ago
JOB DESCRIPTION

The Kubernetes Networking Platform Senior Engineer will lead the design, delivery, and operation of networking capabilities across the enterprise Kubernetes platform. This includes critical components such as ingress controllers, service mesh, DNS, and traffic management. This engineer will join a team responsible for building a secure, scalable, and observable networking layer that enables application teams to seamlessly connect, communicate, and expose services within and outside the cluster.

The ideal candidate brings deep experience in distributed systems and networking, and is passionate about building platform abstractions that simplify complexity for developers while maintaining enterprise-grade reliability and security.

We are transforming the way technology is managed. Automation, DevOps, and product-oriented platform engineering are the new standards as we enable rapid innovation, speed-to-market, and resilient operations. As we are early in this journey, strong technical leadership and the ability to influence and elevate others are critical.

CANDIDATE PROFILE

Required:
• Undergraduate degree in an engineering or computer science discipline and/or equivalent experience/certification
• 6+ years of technology experience, including:
  o 3+ years in a platform, infrastructure, or systems engineering role
  o 3+ years working with public cloud platforms (AWS, Azure, GCP)
• Strong experience with Kubernetes, including:
  o Networking fundamentals (CNI, service discovery, load balancing)
  o Kubernetes networking primitives (Services, Ingress, NetworkPolicy)
• Hands-on experience with Kubernetes networking components, such as:
  o Gateway and Ingress controllers (e.g., kgateway, NGINX, ALB, or similar)
  o Service mesh technologies (e.g., Istio, Cilium, or similar)
  o DNS systems (CoreDNS, External DNS, or enterprise DNS integration)
• Experience designing and operating highly available, distributed systems (99.99% uptime) with attention to latency, resiliency, and failure modes
• Strong troubleshooting skills across layers (application, network, infrastructure)
• Proven ability to implement Infrastructure as Code and automation using tools such as Terraform, Helm, and GitOps workflows
• Mindset of 'automate first', continuously identifying and eliminating manual processes
• Experience working within a platform-as-a-product model, including:
  o Treating internal platform capabilities as products
  o Gathering feedback from users (application teams)
  o Iterating based on adoption and usability
• Strong collaboration habits, including:
  o Code reviews as a primary mechanism for quality and knowledge sharing
  o Writing clear, user-focused documentation
  o Contributing to and evolving engineering standards across teams
• Comfort using AI-powered development tools (e.g., coding assistants, copilots, or similar) to accelerate development, troubleshooting, and documentation
• Ability to critically evaluate AI-generated output, ensuring correctness, security, and alignment with platform standards
• Experience leveraging AI tools to:
  o Accelerate Infrastructure as Code development
  o Troubleshoot complex system and networking issues
  o Improve documentation and developer experience
• Strong engineering judgment to determine when to rely on AI vs. when to deep dive manually, especially in complex distributed systems and production incidents

Preferred:
• Experience with one or more high-level programming languages (Go, Python, Java, or similar)
• Deep understanding of:
  o Layer 4 / Layer 7 networking concepts
  o Traffic routing, load balancing strategies, and API gateway patterns
  o mTLS, zero-trust networking, and secure service-to-service communication
• Experience operating service mesh at scale (traffic shaping, retries, circuit breaking, observability)
• Familiarity with cloud-native networking integrations:
  o AWS (ALB/NLB, VPC Lattice, Route53, PrivateLink)
  o Hybrid or multi-cluster networking patterns
• Experience with observability tooling (metrics, logs, tracing) for network and service performance
• Ability to influence platform adoption and drive best practices across application teams
• Experience supporting or implementing self-healing, resilient infrastructure patterns

CORE WORK ACTIVITIES

Kubernetes & Platform Engineering
• Design, build, and operate Kubernetes networking capabilities, including ingress, service mesh, and DNS
• Develop and maintain standardized, self-service networking patterns for application teams
• Implement and manage traffic routing strategies, including canary deployments, blue/green releases, and failover mechanisms
• Ensure secure communication through network policies, mTLS, and zero-trust principles
• Continuously improve platform reliability, scalability, and performance through automation and observability
• Troubleshoot complex networking issues across distributed systems and drive root cause analysis
• Partner with security, platform, and application teams to define and enforce networking standards
• Build tooling and automation to improve developer experience and reduce operational overhead
• Maintain clear and consumable documentation for platform users
• Stay current with emerging trends in Kubernetes and cloud-native networking

Platform Operations, Reliability & Security
• Serve in the on-call rotation
• Support monitoring, logging, and observability integrations (Prometheus, Grafana, ELK, OpenTelemetry)
• Implement and maintain platform-level security controls, including OPA/Gatekeeper, secrets management (e.g., Vault), and IAM guardrails
• Participate in disaster recovery, backup/restore operations, and upgrade cycles
• Review issues, logs, and metrics to identify trends and propose improvements
• Maintain clear and complete documentation for system configurations and operational procedures

Collaboration & Leadership
• Participate in architectural discussions to help application teams make efficient platform decisions
• Provide mentorship to junior engineers and contribute to peer reviews
• Support interviewing and help foster a modern engineering culture
• Collaborate with cross-functional teams, including software engineering, cloud operations, and security

Managing Priorities and Delivery
• Contribute to planning, prioritization, and organization of engineering work to meet delivery timelines
• Provide technical leadership for successful platform feature delivery
• Assist with evaluating vendor solutions and tooling, providing recommendations to leadership
• Communicate technical concepts clearly to stakeholders, both technical and non-technical
• Understand business priorities and contribute to delivering against performance and budget goals
• Perform other reasonable duties as assigned

About the Company

MROADS