ARM (Advanced RISC Machine), Apple, Application Programming Interface (API), Artificial Intelligence (AI), Attorney, Automotive Industry, Automotive Manufacturing, Benchmarking, Best Practices, Budget Management, Budgeting, C++ Programming Language, CUDA (Compute Unified Device Architecture), Caching, Cisco Unity, Code Reviews, Communication Skills, Cost Modeling, Desktop PC, Ecosystems, Embedded Hardware, English Language, Federal Trade Commission, GPU (Graphics Processing Unit), Government, Graphics, Healthcare, Intel Product Family, JavaScript, Kernel Programming, Laptop PC, Legal, Machine Learning, Machine Tool, Memory Hardware, Mentoring, Mobile Devices, Modeling Languages, Objective-C Programming Language, Open Source, OpenGL Programming Libraries, Performance Engineering, Performance Metrics, Product Planning, Prototyping, Python Programming/Scripting Language, Recruiting/Staffing Agency, Scientific Research, Software Engineering, Stock Keeping Unit (SKU), System Integration (SI), Team Player, Technical Support, Telemetry, Web Browsers
Senior Machine Learning Engineer, On-Device & Mobile AI Optimization, San Francisco, CA, USA - Unity Careers
Requisition ID: JOBREQ-2616041
Senior Machine Learning Engineer, On-Device & Mobile AI Optimization
San Francisco, CA, USA, Full-time
ALERT: Unity has received reports of scams where individuals purporting to be Unity HR representatives conduct bogus employment interviews via email or text, and then request payment as a condition for receiving an offer of employment. Please be aware that Unity does not conduct interviews by email or text, and will never request payment as a condition for applying for a position or receiving an offer of employment. These scam operators may also ask for your personal information (name, address, birthdate, social security number, etc.), which you should not provide to them. If you have been a target of such a scam, you should report it by contacting the U.S. Federal Trade Commission (see this FTC posting for further details), the office of your state Attorney General, or the government agency responsible for investigating matters such as this where you reside.
The opportunity
We are building the next generation of AI-driven game experiences, running generative models on-device, right where the players are: on phones, tablets, laptops, and desktops. Our games run inside a modern, browser-native runtime (built on technologies such as WebGPU and WebNN), so the models that power these experiences must be deployed and accelerated entirely within that runtime. As a Senior Machine Learning Engineer for On-Device & Mobile AI, you will take state-of-the-art multi-modal models (transformers, diffusion networks, and vision-language models, or VLMs) and make them run fast, small, and reliably on mobile and constrained hardware.

This is a deeply hands-on role. You will own the optimization and deployment of significant parts of the inference stack: from a trained checkpoint leaving research, through export, quantization, and kernel-level tuning, to a shipped feature running inside the engine at interactive frame rates within a fixed memory and power budget. Your work directly shapes the latency, quality, memory footprint, and battery profile of AI features experienced by billions of players.

This role is for an engineer who is energized by the gap between a research model and a shipping, on-device product. If you enjoy profilers, frame captures, op-fusion, and shaving milliseconds and megabytes, this is your role.

What you'll be doing

- Inference & On-Device Optimization
  - Own the optimization pipeline for the models you ship: model export, graph transformation, operator fusion, memory-layout planning, and hardware-specific tuning across NPU, mobile GPU, and desktop/laptop GPU.
  - Apply quantization (INT4/INT8/FP16), weight sharing, structured/unstructured pruning, and knowledge distillation to hit hard latency, memory, and power budgets, and validate them against quality bars.
  - Do low-level performance work: write and tune WebGPU compute shaders (WGSL) and, where relevant, native kernels (Metal, Vulkan/SPIR-V compute, CUDA); profile with browser and platform tools (Chrome/Dawn GPU traces, PIX, Instruments/Metal System Trace, Snapdragon Profiler, Nsight, RenderDoc); and eliminate bottlenecks at the op and memory-bandwidth level.
  - Apply efficiency techniques (dynamic resolution, token reduction, cross-frame caching/reuse, reduced-step diffusion samplers) as engineering levers to meet budgets on target SKUs.
- Runtime & Systems Integration
  - Work with WebGPU-targeted inference runtimes (ONNX Runtime Web, Transformers.js, WebLLM, TensorFlow.js) alongside native options (CoreML, ONNX Runtime, TFLite, ExecuTorch), and extend or build glue code where off-the-shelf options fall short of our diffusion and VLM workloads.
  - Build parts of the integration between the ML runtime and the game engine: real-time scheduling, memory pooling, zero-copy buffer sharing between the inference and render paths, and frame-budget management alongside the renderer.
  - Build supporting engineering for your components: model packaging and asset pipelines, on-device fallbacks and SKU-aware capability tiers, crash/quality telemetry, and automated on-device benchmarking in CI.
- Research Productionization
  - Partner with research scientists to turn novel CV and multi-modal architectures into implementations that are deployable, debuggable, and fast on device.
  - Provide a feedback loop into research: surface hardware constraints, op-support gaps, and cost models early so model design and deployment converge.
  - Track breakthroughs in efficient inference (efficient attention, distillation, reduced-step diffusion) and assess them pragmatically: what actually moves latency/memory/power on our target devices.
- Collaboration & Engineering Quality
  - Contribute to engineering best practices, code-review standards, performance-regression gates, and on-device benchmarking methodology.
  - Support a culture of measurement: track KPIs for latency, quality, memory, and power for the systems you work on, across the device matrix.
  - Partner with platform engineers, product managers, and runtime teams to align your work with device-SKU constraints and product roadmaps.
  - Share knowledge and mentor junior and mid-level engineers through code review, pairing, and design discussion.

What we're looking for

- 5+ years in software/ML engineering, with meaningful time focused on on-device / edge inference or real-time, performance-critical systems.
- Production deployment of transformer- and/or diffusion-based models (e.g., ViT, Stable Diffusion, CLIP/SigLIP-style encoders) on mobile, desktop, or embedded hardware: shipped, not just prototyped.
- Hands-on experience with at least one major inference runtime (ONNX Runtime / ORT Web, CoreML, TFLite, ExecuTorch) and a working understanding of operator fusion, memory layout, and runtime scheduling.
- Low-level performance engineering: solid command of at least one GPU/compute API (WebGPU/WGSL, Metal, Vulkan, D3D12, or CUDA) and the profiling tools to go with it. You can read a frame capture and a kernel trace and reason about where the time and memory go.
- Working knowledge of model-optimization techniques (quantization at INT4/INT8/FP16, weight sharing, pruning, and distillation) and the judgment to apply them to hit latency and memory budgets. You use them effectively as engineering tools.
- Understanding of target hardware: mobile SoCs (Apple Neural Engine, Qualcomm Hexagon/Adreno, ARM Mali) and/or desktop/laptop GPUs (Apple Silicon, NVIDIA, AMD, Intel).
- Strong Python for export pipelines and training-side tooling; familiarity with the core languages of a browser-native runtime (TypeScript/JavaScript, WGSL) is a plus.
- Working fluency with the models you deploy: enough to read an architecture, modify it for deployment, and reason about accuracy trade-offs.
- A collaborative working style: clear communication, reliable delivery, and a willingness to support and learn from teammates.

You might also have

- Experience shipping world-model, neural-rendering, or real-time generative pipelines (NeRF, 3DGS, real-time diffusion, or similar) on device.
- Hands-on experience deploying models through WebGPU, e.g., ONNX Runtime Web (WebGPU EP), Transformers.js, WebLLM, or TensorFlow.js, including writing/tuning WGSL compute shaders.
- Game-engine or real-time-graphics background (Unity, Unreal, or a custom engine; Metal/Vulkan/D3D/OpenGL ES render pipelines), especially integrating compute workloads alongside a renderer.
- Contributions to open-source ML inference frameworks, runtimes, or GPU/compute libraries, especially in the WebGPU ecosystem (Dawn, wgpu, ORT Web, Transformers.js, WebLLM).
- Familiarity with compiler stacks (MLIR, TVM, IREE, XLA) for custom kernel generation and graph optimization.
- Experience with on-device benchmarking infrastructure, performance-regression CI, and device-farm matrices.
- Proficiency in C++/Objective-C/Swift for runtime integration.

Additional information

- Relocation support is not available for this position.
- Work visa/immigration sponsorship is not available for this position.

Benefits
At Unity, we want our team members to thrive. We offer a wide range of benefits designed to support well-being and work-life balance.

Please note: Benefits eligibility, specific offerings, and coverage vary based on the country and employment status.

While specific benefits vary, here are some of the ways we strive to take care of our eligible team members globally: Comprehensive health, life, and disability insurance | Commute subsidy | Employee stock ownership | Competitive retirement/pension plans | Generous vacation and personal days | Support for new parents through leave and family-care programs | Office food snacks | Mental Health and Wellbeing programs and support | Employee Resource Groups | Global Employee Assistance Program | Training and development programs | Volunteering and donation matching program

Life at Unity
Unity [NYSE: U] is the world's leading game engine, powering play for more than 3 billion consumers each month. The top mobile games in the world, the most played PC indie titles, the most innovative console games, and virtually all of the top XR and Web Games are developed, deployed, and grown in Unity. Unity also enables teams across industries like automotive, manufacturing, and healthcare to design, simulate, and collaborate in 3D, closing the gap between ideas and reality. For more information, please visit www.unity.com.

Unity is a proud equal opportunity employer. We are committed to fostering an inclusive, innovative environment and celebrate our employees across age, race, color, ancestry, national origin, religion, disability, sex, gender identity or expression, sexual orientation, or any other protected status in accordance with applicable law. Our differences are strengths that enable us to support the growing and evolving needs of our customers, partners, and collaborators. If you have a disability that means there are preparations or accommodations we can make to help ensure you have a comfortable and positive interview experience, please fill out this form to let us know.

Pursuant to the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.

This position requires the incumbent to have a sufficient knowledge of English to have professional verbal and written exchanges in this language since the performance of the duties related to this position requires frequent and regular communication with colleagues and partners located worldwide and whose common language is English.

Headhunters and recruitment agencies may not submit resumes/CVs through this Web site or directly to managers. Unity does not accept unsolicited headhunter and agency resumes. Unity will not pay fees to any third-party agency or company that does not have a signed agreement with Unity.

Your privacy is important to us. Please take a moment to review our Prospect Privacy Policy and Applicant Privacy Policy. Should you have any concerns about your privacy, please contact us at DPO@unity.com.

Gross pay salary: $188,200–$282,200 USD
Note: This range reflects the anticipated base salary for this position. Beyond base salary, this role may be eligible for equity awards and participation in our company incentive plans (such as annual discretionary bonuses or sales commissions). The final offer amount will depend on several factors, including geographic location and the candidate's relevant experience, professional background, and skill set.
Location: San Francisco, CA, USA
Department: AI & Machine Learning
Type: Full-time
Requisition ID: JOBREQ-2616041