About the job
About Our Team
Join the Frontier Evaluations & Environments team at OpenAI, where we build transformative model environments that pave the way for safe artificial general intelligence (AGI) and artificial superintelligence (ASI). Our team constructs ambitious evaluation environments that not only measure but also enhance the capabilities of our models, creating self-improvement loops that inform our training, safety, and deployment strategies. Our notable open-source evaluations include GDPval, SWE-bench Verified, MLE-bench, PaperBench, and SWE-Lancer. We have also run frontier evaluations for models such as GPT-4o, o1, o3, GPT-4.5, ChatGPT Agent, and GPT-5. If you are passionate about experiencing firsthand the rapid advancement of our models and guiding them toward a positive impact, this is the opportunity for you.
Your Role
We are seeking exceptional research engineers who are eager to push the limits of our frontier models. Ideal candidates will play a vital role in shaping our empirical understanding of AI capabilities across a broad spectrum and will own specific projects from conception to execution.
Key Responsibilities:
Design and implement ambitious reinforcement learning environments to maximize our models' potential.
Conduct assessments of frontier model capabilities, skills, and behaviors.
Create innovative methodologies for the automated exploration of model behaviors.
Guide training processes for our largest model training initiatives, gaining insight into the future of AI.
Collaborate with cross-functional teams to align model evaluations with organizational objectives.