📍 Location: Seattle
⏰ Job Type: Full-time
📅 Posted: June 03, 2026

About the Role

**Role Number:** 200664012-3337

**Summary**
We're building the evaluation platform that will serve all of Apple's generative AI and agent systems. Evaluating non-deterministic AI systems is one of the hardest unsolved problems in production ML, and one Apple has to get right at scale. Our goal is to make it tractable for every team here.
This is a hands-on engineering role with substantial autonomy. You'll write a lot of Python and own meaningful pieces of the platform end-to-end. You'll partner closely with research engineers, model and serving teams, product and feature teams, and the infra and data platform groups this work integrates with.

**Description**
Build and ship: Take ownership of features and services within the evaluation platform, including APIs, SDKs, orchestration components, and evaluation runners. You'll have the room to make calls on your own work and the support to deliver it well.
Productionize ML research: Partner with rese...
