Systems Development Engineer, Managed Edge Compute (Amazon Robotics)

Amazon.com Inc

Austin, TX

JOB DETAILS
SKILLS
ARM (Advanced RISC Machine), Amazon Web Services (AWS), Architectural Services, Artificial Intelligence (AI), Automation, Automotive Repair and Maintenance, BSP, Best Practices, Booting, Building Systems, Business Operations, Cloud Computing, Code Reviews, Concrete, Debugging Skills, Debugging Tools, Design Document, Device Drivers, Distributed Computing, Documentation, Ecosystems, Embedded Systems, Error Handling, Fleet Management, GPU (Graphics Processing Unit), Hardware Quality Assurance, Home Automation, Identify Issues, Industrial Robotics, Input/Output, Internet of Things, Kernel Programming, Keyboards, Linux Administration, Linux Kernel, Linux Operating System, Logistics, Machine Tool, Memory Hardware, Metrics, Operating Systems, Order/Customer Fulfillment, Problem Solving Skills, Process Improvement, Python Programming/Scripting Language, Risk Analysis, Robotics, Rockwell Automation, Rust Programming Language, Software Distribution, System Operations, Systems Administration/Management, Systems Engineering, Systems Scalability, Technical Strategy, Technical Writing, Technical/Engineering Design, Telemetry, Testing, Thin Clients, Time Management, Vehicle Fleets, x86 Processors
LOCATION
Austin, TX
POSTED
4 days ago

We're seeking a Systems Development Engineer to join the Unified Workcell Compute (UWC) team. This is a hands-on, high-impact role where you'll design and build systems that manage Amazon's edge device fleet - over a million devices across thousands of locations worldwide. You'll work at the intersection of cloud infrastructure, device management, robotics systems, and operational excellence, solving complex technical problems that enable Amazon's robotics and fulfillment operations to scale globally.

As a SysDE II, you'll be a strong individual contributor who delivers high-quality technical solutions, contributes to architectural discussions, and builds reliable systems that enable robotics and automation teams to deploy and manage their edge compute solutions with the same ease as deploying to AWS. You'll work within established technical strategies while identifying opportunities for improvement, translating well-scoped business problems into concrete technical solutions, and balancing short-term delivery with long-term system health. This role requires solid technical depth across multiple domains - Linux systems, AWS services, IoT platforms, robotics compute infrastructure, and distributed systems - combined with the ability to partner effectively with engineers across the team and organization.

Key job responsibilities

  • Build and maintain resilient, scalable distributed systems that operate at Amazon scale, contributing to the management of robotics device fleets across thousands of sites with 99.99%+ availability requirements.
  • Contribute to the technical strategy for your team's systems within the UWC architecture, participating in decisions around hyperscale deployments, robotics compute patterns, fleet management, and edge device automation.
  • Participate in architectural reviews and design discussions across UWC and robotics customer teams, contributing technical input on device lifecycle management, software distribution, multi-compute workcell support, and operational excellence patterns.
  • Develop automation solutions using Python, Rust, CDK, and AWS services that eliminate entire classes of operational load and enable self-service for robotics solution teams.
  • Implement and optimize Linux-based systems, OS image creation pipelines (Yocto/mkosi), and BSP solutions for diverse robotics hardware platforms including x86, ARM, NVIDIA GPU systems, and embedded devices.
  • Create tooling and frameworks that enable robotics teams to provision, configure, and manage their edge compute fleets - from AI perception systems to manipulation robotics - with minimal hands-on-keyboard time.
  • Apply established standards for engineering, testing, and operational excellence best practices, and suggest improvements to processes within your team.
  • Identify and implement opportunities to streamline or eliminate excess processes, improving agility and reducing complexity for robotics teams building on UWC.
  • Proactively identify and escalate risks at the product and service level, contributing to the resilience, performance, and cost efficiency of UWC systems that support critical robotics operations.
  • Troubleshoot complex production issues across the full stack - from robotics device hardware and Linux kernel to AWS cloud services - identifying patterns and implementing solutions that prevent future incidents.
  • Partner with robotics solution teams (Amazon Robotics, manipulation systems, AI perception, workcell automation) to understand their device management challenges and contribute to solutions that meet their specific requirements.
  • Foster the growth of peers on your team through code reviews, knowledge sharing, and collaborative problem-solving that raises the technical bar.
  • Deliver solutions that are inventive, resilient, and extensible, making it easier for robotics teams to build on UWC.
  • Participate in hiring and contribute to technical assessm

A day in the life

Your day might start by investigating an issue where robotics devices across multiple fulfillment centers are experiencing intermittent kernel panics during high-load operations. You dive deep into kernel logs, memory dumps, and device telemetry, correlating the failures with a recent driver update for NVIDIA GPU systems. You develop a Python or Rust-based diagnostic tool to capture more granular system metrics and partner with senior engineers to roll back the problematic driver version while working on a fix that addresses the underlying memory management issue.

Mid-morning, you're troubleshooting why a new OS image isn't booting correctly on ARM-based manipulation robotics devices. You boot into a recovery environment, examine the initramfs, trace through systemd unit dependencies, and discover a race condition in the device initialization sequence. You modify the Yocto recipe to fix the boot ordering, test across multiple hardware variants, and document the pattern for other teams building custom images. You then join a sync with an Amazon Robotics team to help them debug why their software components are failing to deploy - walking through IoT certificate validation, network connectivity from the edge device, and AWS IAM permissions until you identify a misconfigured security group.

After lunch, you're participating in a code review for a new credential rotation service - providing written feedback on error handling patterns, memory safety, and how to better structure the state machine for resilience. You spend time optimizing a Linux system configuration that's causing performance bottlenecks on AI perception systems - configuring and tuning Linux system parameters to enable high-performance compute workloads. You pair with a teammate who's working through a complex Yocto build failure, sharing what you know about layer dependencies and BitBake recipe inheritance while collaborating on debugging techniques.

The afternoon includes responding to a page where devices in a specific building can't connect to AWS IoT Core. You systematically eliminate possibilities - checking DNS resolution, testing TLS handshakes, examining certificate chains, and analyzing network packet captures - until you discover a misconfigured firewall rule blocking MQTT traffic. You implement a monitoring enhancement to detect this class of issue proactively across all sites. You then contribute to a technical design document proposing improvements to UWC's device provisioning workflow that will reduce provisioning time from 20 minutes to under 10 minutes by parallelizing certificate generation and optimizing the Linux boot sequence. You end your day reviewing system metrics across the fleet, flagging devices with degraded disk I/O that need proactive maintenance, and syncing with your team on priorities for tomorrow.

Amazon offers a full range of benefits that support you and eligible family members, including domestic partners. Benefits can vary by location, the number of regularly scheduled hours you work, length of employment, and job status such as seasonal or temporary employment. The benefits that generally apply to regular, full-time employees include:

  1. Medical, Dental, and Vision Coverage

  2. Maternity and Parental Leave Options

  3. Paid Time Off (PTO)

  4. 401(k) Plan

If you are not sure that every qualification on the list above describes you exactly, we'd still love to hear from you! At Amazon, we value people with unique backgrounds, experiences, and skillsets. If you're passionate about this role and want to make an impact on a global scale, please apply!

About the team

The Unified Workcell Compute (UWC) team is at the forefront of Amazon's robotics and automation efforts, building and operating the foundational device management platform for Amazon's on-premise edge compute fleet. Our services manage over a million robotic devices across thousands of locations worldwide - from the latest NVIDIA GPU offerings enabling AI perception efforts to bleeding-edge manipulation robotics systems, industrial PCs, thin clients, Drive Units, and embedded devices across Amazon's global fulfillment network.

Our mission is to enable robotics solution teams to deploy to Operations buildings with the same self-service, ownership, and accountability as deploying to AWS cloud. We're revolutionizing Amazon's logistics and fulfillment operations by pushing the boundaries of what's possible in automation and compute management at unprecedented scale.

We're a team of builders who value automation, operational excellence, and customer obsession. We own a critical technology ecosystem that powers device provisioning, software distribution, credential management, and fleet operations for robotics workcells and fulfillment systems. Our work directly impacts millions of customer orders and enables Amazon's promise of fast, reliable delivery. We're solving problems that few organizations face, building systems that have never existed before, and defining the future of edge compute management for robotics at Amazon scale.

We foster a culture that encourages personal and professional growth, empowering our team members to continually expand their skills and knowledge. Work-life balance is a priority for us, and we strive to create an environment where our team can thrive both professionally and personally.

About the Company

Amazon.com Inc

At Amazon, we don’t wait for the next big idea to present itself. We envision the shape of impossible things and then we boldly make them reality. So far, this mindset has helped us achieve some incredible things. Let’s build new systems, challenge the status quo, and design the world we want to live in. We believe the work you do here will be the best work of your life.

Wherever you are in your career exploration, Amazon likely has an opportunity for you. Our research scientists and engineers shape the future of natural language understanding with Alexa. Fulfillment center associates around the globe send customer orders from our warehouses to doorsteps. Product managers set feature requirements, strategy, and marketing messages for brand new customer experiences. And as we grow, we’ll add jobs that haven’t been invented yet.

It’s Always Day 1
At Amazon, it’s always “Day 1.” Now, what does this mean and why does it matter? It means that our approach remains the same as it was on Amazon’s very first day – to make smart, fast decisions, stay nimble, invent, and stay focused on delighting our customers. In our 2016 shareholder letter, Amazon CEO Jeff Bezos shared his thoughts on how to keep up a Day 1 company mindset. “Staying in Day 1 requires you to experiment patiently, accept failures, plant seeds, protect saplings, and double down when you see customer delight,” he wrote. “A customer-obsessed culture best creates the conditions where all of that can happen.”

Our Leadership Principles
Our Leadership Principles help us keep a Day 1 mentality. They aren’t just a pretty inspirational wall hanging. Amazonians use them, every day, whether they’re discussing ideas for new projects, deciding on the best solution for a customer’s problem, or interviewing candidates. To read through our Leadership Principles from Customer Obsession to Bias for Action, visit https://www.amazon.jobs/principles
COMPANY SIZE
10,000 employees or more
INDUSTRY
Retail
FOUNDED
1994
WEBSITE
http://Amazon.com/militaryroles