
Inference Technical Lead, Sora

OpenAI · San Francisco
Hybrid · Full-time




Experience Level

Manager

Qualifications

  • Extensive experience in model performance optimization, especially in inference.

  • Strong background in kernel systems, data movement, and performance tuning.

  • Enthusiasm for developing scalable AI systems for multimodal applications.

  • Ability to navigate complex technical challenges and deliver innovative solutions.

About the job

Join the Sora Team at OpenAI

The Sora team is at the forefront of developing multimodal capabilities within OpenAI’s foundational models. We are a dynamic blend of research and product development, committed to integrating sophisticated multimodal functionalities into our AI offerings. Our focus is on delivering solutions that are not only reliable and intuitive but also resonate with our mission to foster broad societal benefits.

Your Role as Inference Technical Lead

We are seeking a talented GPU Inference Engineer to enhance the model serving efficiency for Sora. This pivotal position will empower you to spearhead initiatives aimed at optimizing inference performance and scalability. You will collaborate closely with our researchers to design and develop models that are optimized for inference, directly contributing to the success of our projects.

By establishing a robust technical foundation, you will advance the team’s overarching objectives and free leadership to concentrate on high-impact initiatives.

Key Responsibilities:

  • Enhance model serving, inference performance, and overall system efficiency through focused engineering efforts.

  • Implement optimizations targeting kernel and data movement to boost system throughput and reliability.

  • Collaborate with research and product teams to ensure our models operate effectively at scale.

  • Design, construct, and refine essential serving infrastructure to meet Sora’s growth and reliability demands.

You Will Excel in This Role If You:

  • Possess deep knowledge in model performance optimization, particularly at the inference level.

  • Have a strong foundation in kernel-level systems, data movement, and low-level performance tuning.

  • Are passionate about scaling high-performing AI systems that address real-world, multimodal challenges.

  • Thrive in ambiguous situations, setting technical direction, and driving complex projects to fruition.

This role is based in San Francisco, CA. We follow a hybrid work model with three in-office days per week, and we offer relocation assistance to new hires.

About OpenAI

OpenAI is a pioneering research and deployment company focused on advancing artificial intelligence for the benefit of all humanity. We are dedicated to exploring the limits of AI capabilities and ensuring the safe deployment of our technologies through innovative products. Our mission is to create AI that is beneficial and accessible to everyone.
