About the job
Join the Sora Team
At Sora, we are at the forefront of integrating video capabilities into OpenAI’s foundational models. Our innovative hybrid research and product team is dedicated to expanding the boundaries of video model capabilities while ensuring their reliability and safety. We achieve this through rigorous research, experimentation, and real-world deployment, aiming to disseminate our advancements to a broader audience.
Your Role as a Distributed Systems/ML Engineer
In this pivotal role, you will improve the training throughput of our internal framework so that researchers can experiment with cutting-edge ideas. Your responsibilities will include designing, implementing, and optimizing state-of-the-art AI models, keeping your machine learning code free of bugs, and applying your expertise in supercomputer performance. We seek individuals who are passionate about performance optimization, have a deep understanding of distributed systems, and do not tolerate bugs in their code.
This position is based in San Francisco, CA, following a hybrid work model with three days in the office each week. We also provide relocation assistance for new team members.
Key Responsibilities:
Collaborate closely with researchers to facilitate the development of systems-efficient video models and architectures.
Implement the latest techniques within our training framework to achieve exceptional hardware efficiency during training runs.
Profile and optimize our training framework to ensure peak performance.
You Will Excel in This Role If You:
Possess experience with multi-modal machine learning pipelines.
Enjoy delving into system implementations and grasping their fundamentals to enhance performance and maintainability.
Demonstrate strong software engineering expertise and proficiency in Python.
Have experience understanding and optimizing training kernels.
Are eager to explore stable training dynamics.
About OpenAI
OpenAI is a pioneering AI research and deployment organization committed to ensuring that general-purpose artificial intelligence is beneficial for all of humanity. We continually push the boundaries of what is possible with AI, striving to create a positive impact in various fields.

