About the job
Join Our Innovative Team
At OpenAI, our Alignment team is committed to building AI systems that prioritize safety, trustworthiness, and alignment with human values, even as these systems evolve and grow in complexity. We are at the forefront of AI research, developing advanced methodologies to ensure that AI adheres to human intent across diverse scenarios, including high-stakes and adversarial environments. Our focus is on tackling the most critical challenges, addressing areas where AI can have profound impacts. By quantifying risks and making meaningful improvements, we aim to prepare our models for the complexities of real-world applications.
Our approach rests on two foundational pillars: (1) integrating enhanced capabilities into alignment, ensuring our techniques keep pace as model capabilities grow, and (2) centering human input through mechanisms that allow humans to communicate their intent and effectively monitor AI systems, even in intricate situations.
Your Role in Shaping the Future
As a Research Engineer / Scientist on our Alignment team, you will play a pivotal role in ensuring our AI systems align with human intent in complex and unpredictable contexts. Your responsibilities will include designing and implementing scalable solutions that maintain alignment as AI capabilities expand, while incorporating human oversight into AI decision-making processes.
This position is based in San Francisco, CA, and follows a hybrid work model of three days in the office each week. We also offer relocation assistance to new team members.
Key Responsibilities:
Develop and assess alignment for capabilities that are context-dependent, subjective, and difficult to quantify.
Create evaluations to accurately measure risks and alignment with human values and intentions.
Construct tools and evaluations to examine model robustness across various scenarios.
Design experiments to explore how alignment scales with compute resources, data, context lengths, actions, and adversarial influences.
Design new human-AI interaction frameworks and scalable oversight methods that improve human engagement with, and understanding of, AI systems.