
Software Engineer, Inference - Multi Modal

OpenAI · San Francisco
On-site · Full-time





Qualifications

Experience in developing and optimizing inference systems for large language models or multimodal frameworks. Familiarity with GPU-based machine learning workloads and an understanding of performance characteristics of large models are essential. A passion for experimental, rapidly evolving work environments, coupled with a collaborative mindset, will be key to success in this role.

About the job

About Our Team

Join OpenAI’s dynamic Inference team, where we empower the deployment of cutting-edge AI models, including our renowned GPT models, advanced Image Generation capabilities, and Whisper, across diverse platforms. Our mission is to ensure these models are not only high-performing and scalable but also available for real-world applications. Collaborating closely with our Research team, we’re committed to bringing the next generation of AI innovations to fruition. As a compact, agile team, we prioritize delivering an exceptional developer experience while continuously pushing the frontiers of artificial intelligence.

As we expand our focus into multimodal inference, we are building the necessary infrastructure to support models that process images, audio, and other non-text modalities. This work involves tackling diverse model sizes and interactions, managing complex input/output formats, and ensuring seamless collaboration between product and research teams.

About The Role

We are seeking a passionate Software Engineer to aid in the large-scale deployment of OpenAI’s multimodal models. You will join a small yet impactful team dedicated to creating robust, high-performance infrastructure for real-time audio, image, and various multimodal workloads in production environments.

This position is inherently collaborative; you will work directly with researchers who develop these models and with product teams to define novel interaction modalities. Your contributions will enable users to generate speech, interpret images, and engage with models in innovative ways that extend beyond traditional text-based interactions.

Key Responsibilities:

  • Design and implement advanced inference infrastructure for large-scale multimodal models.
  • Optimize systems for high-throughput and low-latency processing of image and audio inputs and outputs.
  • Facilitate the transition of experimental research workflows into dependable production services.
  • Engage closely with researchers, infrastructure teams, and product engineers to deploy state-of-the-art capabilities.
  • Contribute to systemic enhancements, including GPU utilization, tensor parallelism, and hardware abstraction layers.

You May Excel In This Role If You:

  • Have a proven track record of building and scaling inference systems for large language models or multimodal architectures.
  • Possess experience with GPU-based machine learning workloads and a solid understanding of the performance dynamics associated with large models, particularly with intricate data types like images or audio.
  • Thrive in a fast-paced, experimental environment and enjoy collaborating with cross-functional teams to drive impactful results.

About OpenAI

OpenAI is at the forefront of artificial intelligence innovation, dedicated to advancing digital intelligence in a way that is safe and beneficial for humanity. Our team is composed of talented engineers and researchers who are passionate about pushing the boundaries of AI technology and making a profound impact in various domains.
