The Platform Engineering group within Enterprise IOT supports the reliability, security, scalability, and efficiency of our applications, infrastructure, and devices in cloud and retail environments.
This role is new on our team. It is being added to help build highly automated capabilities for hands-off operation of a large, distributed fleet of Linux OS instances, in advance of initiatives that will result in tens of thousands of new devices in our retail stores.
The Linux Platform Engineer will play a significant role in defining needs, choosing technical approaches (utilities, libraries, etc.), and implementing tooling & automation. This will include ongoing collaboration with other Platform Engineering and Product Development engineers to collect input, review designs, and contribute requirements for scripting or development efforts by others in support of this work.
Required Skills -
Linux, Web Services, Containerization, Scripting
Job Duties -
Key areas of focus for this role include:
• OS standard build & image: configuration, hardening, baked-in software, image packaging
• Headless recovery capability: self-testing, automated reimage/rebuild
• Observability: health & performance monitoring, logging, integration with existing SaaS platform
• No-touch management: script common admin, maintenance, reporting, and troubleshooting tasks that can be scheduled or wired into existing remote-trigger mechanisms, with success/failure events and harvested information pushed into logging/monitoring
• Certificate / key distribution and management: contribute to the overall system design and build the necessary OS-side tooling
The role also includes documentation tasks, as well as a small amount of day-to-day production support and break-fix work, particularly in earlier phases before operations are fully automated.
Job Requirements -
• 5+ years of experience supporting Linux OS instances at scale (experience with both cloud and bare-metal / physical servers is a big plus), including:
    o Deep knowledge of Linux fundamentals and troubleshooting, including process execution & threads, memory usage, kernel & userland, system calls, signals / signal handlers, storage, and authentication
    o Imaging, boot / partition management, and unattended build/rebuild approaches and tools
    o Automation of day-to-day configuration, maintenance, information gathering, and problem detection and resolution
    o Performance monitoring, tuning, and troubleshooting for base OS functionality and installed applications
    o Experience with enterprise-grade IoT device management platforms, OTA software update solutions, and secure device provisioning and authentication
    o Network fundamentals, configuration, and troubleshooting (wireless networking a plus)
    o Patch / vulnerability management, security best practices and tools
    o Application of zero-touch provisioning and zero-trust processes or methodologies
• Experience supporting modern web services & architectures, including an understanding of the layered protocol model, IP addressing, and flows for common services like DHCP, DNS, SSH, HTTP/HTTPS, etc.
• Experience with containerized Linux environments, including:
    o Docker - image builds, runtime support, segmentation/sharing between container & host
    o Kubernetes - network concepts, pod deployment, resource allocation, monitoring, troubleshooting
• Expert-level shell scripting
Desired Skills & Experience -
• Python
• Monitoring platforms like New Relic or Datadog
Required Skills: Python, Linux
Basic Qualification:
Additional Skills: Systems Engineer, Software Developer
Background Check: No
Drug Screen: No