Apple is where individual imaginations gather together, committing to the values that lead to great work. Every new product we build, service we create, or experience we deliver is the result of us making each other's ideas stronger. The diversity of our people and their thinking inspires the innovation that runs through everything we do. When we bring everybody in, we can do the best work of our lives. Here, you'll do more than join something - you'll add something.
The Special Projects team at Apple is developing novel user-facing conversational features that leverage the multimodal capabilities of state-of-the-art foundation models. As part of this process, we generate real-world and simulated data, gather human data annotations, analyze the results, and use them to build and evaluate Large Language Model (LLM) judges. We are looking for a skilled Data Scientist to join our Machine Learning Evaluations team. This person will work closely with ML Engineers to manage and analyze our human and automated data annotation processes, and to develop, test, and refine LLM judges for generative AI model evaluation. A successful candidate has experience in survey design, data annotation, and LLM prompt engineering and optimization, along with strong statistical analysis skills.
Work closely with ML Engineers to understand data annotation needs
Design and manage data annotation processes, including the development of user instructions, annotation pipeline processing, and process improvement
Develop LLM auto-judges and judging criteria for generative AI model evaluation
Analyze collected data annotations to assess and refine LLM auto-judges
BA or Master's degree in Data Science, Statistics, or a quantitative social science field
2+ years of hands-on experience working in survey design and human data annotation
Proficiency in Python
Excellent communication skills
PhD in Data Science, Statistics, or a quantitative social science field
Hands-on industry experience with product-focused statistical analysis
Experience working with large-scale multimodal data and data-annotation pipelines
Experience with LLM prompt engineering & prompt optimization
Experience with LLM auto-judges for generative AI model evaluation
A track record of publications or technical presentations in Data Science or a related field
Excellent at cross-functional collaboration
We’re a diverse collection of thinkers and doers, continually reimagining what’s possible to help us all do what we love in new ways. The people who work here have reinvented entire industries with the Mac, iPhone, iPad, and Apple Watch, as well as with services, including iTunes, the App Store, Apple Music, and Apple Pay. And the same passion for innovation that goes into our products also applies to our practices — strengthening our commitment to leave the world better than we found it.
There’s a place here for every kind of brilliant. Everyone here is an innovator, or an innovator-to-be, no matter what your team or your role. So bring your passion, courage, and original thinking and get ready to share it, because every new product, service, or feature we invent is the result of people working together to make each other’s ideas stronger. Innovation at this level depends on people who represent the variety of the human experience and inspire us with their own fresh perspectives. Together, we’ll do amazing work that can make a difference in people’s lives. Including your own.