About Our Team
Join the Foundations Research team, where we tackle ambitious projects that could redefine the future of AI. Our mission is to advance the science behind training and scaling frontier models, with a focus on data utilization, scaling methodologies, optimization strategies, model architectures, and efficiency improvements that accelerate scientific breakthroughs.
About the Position
We are seeking a technical research lead to drive our embeddings-focused retrieval initiatives. You will lead a team of research scientists and engineers building foundational technologies that enable models to access and use the right information precisely when needed. This work spans designing novel embedding training objectives, architecting scalable vector storage, and developing adaptive indexing techniques.
This pivotal role will contribute to various OpenAI products and internal research initiatives, offering opportunities for scientific publication and significant technical influence.
This position is based in San Francisco, CA. We use a hybrid work model with three days in the office per week, and we offer relocation assistance to new employees.
Your Responsibilities
Lead cutting-edge research on embedding models and retrieval systems optimized for grounding, relevance, and adaptive reasoning.
Supervise a team of researchers and engineers in building an end-to-end infrastructure for training, evaluating, and integrating embeddings into advanced models.
Drive advancements in dense, sparse, and hybrid representation techniques, metric learning, and retrieval systems.
Work collaboratively with Pretraining, Inference, and other Research teams to seamlessly integrate retrieval throughout the model lifecycle.
Contribute to OpenAI's ambitious vision of developing AI systems with robust memory and knowledge access capabilities rooted in learned representations.
You Will Excel in This Role If You Possess
A proven track record of leading high-performing teams of researchers or engineers in ML infrastructure or foundational research.
In-depth technical knowledge in representation learning, embedding models, or vector retrieval systems.
Familiarity with transformer-based large language models and their interaction with embedding spaces and objectives.
Research experience in areas such as contrastive learning and retrieval-augmented generation.