
Data Infrastructure

📍 Location
Multiple Locations
⏰ Job Type
Full-time
📅 Posted
June 03, 2026

About the Role

**Overview**

Help build the world’s most advanced multimodal dataset at Microsoft AI

We are on a mission to create the largest and most advanced multimodal dataset in the world. This dataset, spanning all modalities from across the web and beyond, will power the training of the world’s most capable AI frontier models, pushing the boundaries of scale, performance, and product deployment.

The AI Data Infra team at Microsoft AI builds the data infrastructure that helps MAI teams generate the biggest and best training dataset. Our work spans data pipelines, Spark, Ray, vector databases, and all other aspects of data infrastructure.

We are looking for outstanding individuals excited about contributing to the next generation of systems that will transform the field. In particular, we are looking for candidates who:

- Are passionate about the role of data in large-scale AI model training
- Will thrive in a highly collaborative, fast...
