Senior Researcher - Efficient AI

Microsoft Corp

Redmond, WA

JOB DETAILS
SALARY
$119,800–$234,700 Per Year
SKILLS
Algorithms, Analysis Skills, Artificial Intelligence (AI), Background Investigation, Benchmarking, C++ Programming Language, CUDA (Compute Unified Device Architecture), Caching, Capacity Management, Cloud Computing, Compensation and Benefits, Debugging Skills, GPU (Graphics Processing Unit), Government Requirements, Kernel Programming, Large-Scale Systems, Machine Learning, Microsoft Product Family, Open Source, Operational Audit, Optimization Algorithm, Parallel Computing, Patents, Prototyping, Publications, Python Programming/Scripting Language, Research Skills, Software Design, Strategic Planning, Technical Writing, Technical/Engineering Design
LOCATION
Redmond, WA
POSTED
30+ days ago

Overview

Generative AI is transforming how people create, collaborate, and communicate-redefining productivity across Microsoft 365 for customers worldwide. At Microsoft, we operate one of the largest collaboration and productivity platforms in the world, serving hundreds of millions of consumer and enterprise users. Delivering these AI experiences at scale requires solving some of the hardest efficiency challenges in modern AI systems.

We are an applied research team focused on advancing efficiency across the AI stack, spanning models, ML frameworks, cloud infrastructure, and hardware. We drive mid- and long-term product innovation through close collaboration with research and product teams across the company. We communicate our research both internally and externally through internal technical reports, academic conference publications, open-source releases, and patents. Beyond producing research, we take responsibility for driving ideas through prototyping, validation, and production, with a bias toward real-world impact.

The Candidate

The candidate will work across the full stack-from large-scale serving systems to hardware- and kernel-level optimizations-exploring algorithmic, systems, and hardware/software co-design techniques. Areas of focus include batching, routing, scheduling, caching, endpoint configuration, and GPU architecture-aware optimizations. This role emphasizes end-to-end ownership, with responsibility for identifying high-impact problems and driving research ideas through prototyping, validation, and deployment to deliver measurable customer impact.

Responsibilities

  • Formulate, develop, and evaluate new algorithmic and system-level approaches for end-to-end AI serving, using analytical modeling and large-scale measurement to study token-level latency, tail latency (p95/p99), throughput-per-dollar, cold-start behavior, warm pool strategies, and capacity planning under multi-tenant SLOs and variable sequence lengths.
  • Design and experimentally evaluate endpoint configuration and execution policies, including batching, routing, and scheduling strategies, tensor and pipeline parallelism, quantization and precision profiles, speculative decoding, and chunked or streaming generation, and drive the most promising approaches through robust rollout and validation into production.
  • Perform hardware- and kernel-aware optimization by collaborating closely with model, kernel, compiler, and hardware teams to align serving algorithms with attention/KV innovations and accelerator capabilities.
  • Build and benchmark experimental prototypes and large-scale measurements to validate research ideas and drive them toward production readiness; produce clear technical documentation, design reviews, and operational playbooks.
  • Publish research results, file patents, and, where appropriate, contribute to open-source systems and serving frameworks.

Qualifications

Required Qualifications

  • Doctorate in relevant field OR Masters Degree in relevant field AND 3+ years related research experience OR Bachelors Degree in relevant field AND 4+ years related research experience OR equivalent experience.
  • Ability to meet Microsoft, customer and/or government security screening requirements are required for this role. These requirements include but are not limited to the following specialized security screenings:
  • Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.

Preferred Qualifications

  • Demonstrated experience in designing and optimizing efficient inference systems, combining foundations in algorithmic optimization, parallel computing, and request orchestration under strict SLO constraints with deep knowledge of attention and KV‑cache optimizations, batching and scheduling strategies, and cost‑aware deployment.
  • 3+ years of experience with machine learning frameworks (e.g., PyTorch, TensorFlow) and inference serving frameworks (e.g., vLLM, Triton Inference Server, TensorRT-LLM, ONNX Runtime, Ray Serve, DeepSpeed-MII).
  • 3+ years of experience in GPU programming and optimization, with expert knowledge of CUDA, ROCm, Triton, PTX, CUTLASS, or similar GPU programming frameworks.
  • Proficiency in C++ and Python for high-performance systems, with code quality and profiling/debugging skills.
  • Research impact through publications and/or patents, coupled with hands‑on experience taking research ideas through execution and delivery in production.

Benefits and Compensation

The typical base pay range for this role across the U.S. is USD $119,800 - $234,700 per year. There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $158,400 - $258,000 per year. Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here: https://careers.microsoft.com/us/en/us-corporate-pay

Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance with religious accommodations and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations.

About the Company

M

Microsoft Corp

DO WHAT YOU LOVE
Make your mark on the world’s most used technologies. Develop the next hit mobile application. Pioneer a startup that could be the next big thing. At Microsoft, you choose your path.

Headquartered in Redmond, Washington, Microsoft is a top innovator in both the consumer and enterprise technology industry. Just a few of the many things our products do are unleash creativity, connect businesses, and make learning more fun. But our continued success is based on one thing: our employees. We hire amazing, talented people and give them the opportunities—and the tools—to succeed.

WHY MICROSOFT?
As a Microsoft employee, you’re surrounded by a diverse group of the smartest people in your field. This fosters new ideas, better business results, and creates a dynamic work environment. In the office, you’re constantly challenged and supported by your colleagues. Every day holds something new and exciting.

We also offer unparalleled depth and breadth of career opportunities. As an industry leader in multiple fields, working for Microsoft means being able to do whatever you feel passionate about—and being able to make an impact in that field. From day one, we give our employees significant responsibility. This means that you’ll know that you directly contributed to something that has a positive impact on people worldwide. Whether you choose to work in management, dive deep into the newest technology, or explore multiple professions, you’ll find everything you need at Microsoft to drive your career—and to make a difference.

WE GET IT – YOU’RE MORE THAN YOUR JOB
Everyone works differently and is motivated by different things. We also understand that there’s more to you than your job. That’s why we offer competitive pay and a wide assortment of benefits-- to help you make the most of life at work and away from it.

GET THE BALL ROLLING
COMPANY SIZE
10,000 employees or more
INDUSTRY
Computer Software
FOUNDED
1975
WEBSITE
http://www.microsoft.com