DevOps & Site Reliability Lead - Retail DevOps

K&K Global Talent Solutions Inc.

Deerfield, IL

JOB DETAILS
SKILLS
Application Programming Interface (API), Applications Security, Automation, Best Practices, Budgeting, C++ Programming Language, Cloud Computing, Coaching, Continuous Deployment/Delivery, Continuous Integration, Corrective Action, Cost Control, Database Technology, DevOps, High Availability, IBM AS-400 Server, IBM DB2, Identity Data Management, Incident Response, Java, Leadership, Logistics, Mentoring, Merchandising, Metrics, Microsoft Windows Azure, MySQL, Object Oriented Programming (OOP), Operational Improvement, Operational Strategy, Oracle Database, Point of Sale (POS) Systems, Python Programming/Scripting Language, REST (Representational State Transfer), Reliability Engineering, Reporting Dashboards, Retail, Security Architecture, Service Level Agreement (SLA), Software Engineering, Spring Framework, Spring MVC, VMS Operating System, Virtual Machine (VM)
LOCATION
Deerfield, IL
POSTED
4 days ago
Job Description
Must Have Technical/Functional Skills
Technology and Programming (Expert Level)
  • Strong proficiency in Java full-stack development
  • Object-Oriented programming principles and concepts
  • Hands-on experience with Spring Framework (Spring Boot, Spring MVC, Spring Security)
  • Knowledge of RESTful API development
  • Experience with databases such as Oracle, DB2, and MySQL
  • Proficiency in the BASE24-eps payment switch, C++, AS/400, and Python is an added advantage
Domain, Cloud & Platform Engineering
  • Must have Retail domain experience (Point of Sale, Payment Systems, Merchandising, Inventory, Logistics)
  • Expertise in Microsoft Azure, including:
    • Compute (VMs, App Services, Azure Container Apps)
    • Containers & Orchestration (AKS, Docker)
    • Networking (VNets, Private Endpoints, Application Gateway, Load Balancers)
    • Storage, Azure Key Vault, Azure Monitor, Log Analytics
  • Proven experience designing enterprise-grade, highly available cloud platforms
DevOps & Engineering Excellence
  • Advanced experience with Azure DevOps and CI/CD pipeline architecture
  • Strong scripting skills (PowerShell, Bash)
  • GitOps concepts, branching strategies, release orchestration
Site Reliability Engineering (Leadership Level)
  • Ownership of platform reliability, resiliency, and performance
  • Definition and governance of:
    • SLIs, SLOs, SLAs
    • Error budgets and reliability metrics
  • Advanced observability strategy, design, and implementation:
    • Metrics, logs, traces, alerts, and dashboards using Dynatrace
  • Incident response leadership, RCA facilitation, and long-term remediation planning
  • Experience operating systems with 99.9% to 99.99% availability
Security, Compliance & Cost
  • Secure cloud design using Key Vault, managed identities, RBAC
  • Cost optimization (FinOps mindset) across cloud infrastructure
Roles & Responsibilities
  • Act as Lead SRE for the client's Retail platforms, owning reliability and stability outcomes
  • Define and enforce SRE standards, best practices, and operating models
  • Architect and govern highly available, scalable cloud platforms
  • Lead the design and implementation of CI/CD and IaC strategies
  • Establish proactive monitoring, alerting, and incident prevention mechanisms
  • Own major incident leadership, RCA execution, and corrective action tracking
  • Partner with application, security, and architecture teams to build reliability by design
  • Drive automation to reduce toil and improve operational efficiency
  • Mentor and coach SRE and DevOps engineers across teams
  • Influence roadmap decisions with a reliability, scalability, and cost lens

About the Company

K&K Global Talent Solutions Inc.