About the job
Join Scale Labs as a Research Scientist — Agent Robustness
Scale is the premier partner for data and evaluation at the forefront of AI innovation, playing a crucial role in understanding and safeguarding AI models and systems. Building on this expertise, Scale Labs has started a dedicated team focused on policy research, aiming to connect AI research with global policymakers so they can make informed, scientifically grounded decisions about AI risks and capabilities.
Our research addresses complex challenges in agent robustness, AI control protocols, and AI risk evaluations, helping governments, industry, and the public understand and mitigate AI risks while promoting AI adoption. The team collaborates across industry, public services, and academia, and regularly shares its findings. We are actively inviting skilled researchers to contribute to this vision.
As a Research Scientist specializing in Agent Robustness, you will tackle foundational challenges in creating AI agents that are both safe and aligned with human values. Your responsibilities may include:
- Investigating the science behind AI agent capabilities, focusing on safety, risk factors, and benchmarking methodologies.
- Designing and building test harnesses that evaluate whether AI agents take harmful actions under user pressure or environmental manipulation.
- Creating exploits and mitigations for new failure modes that emerge as AI agents gain capabilities such as coding, web browsing, and computer use.
- Characterizing and developing mitigations for potential failure modes or broader risks involving multiple interacting AI agents.

