Research Engineer in Machine Learning - Reinforcement Learning

Anthropic, London, UK
On-site, Full-time

Qualifications

We are looking for candidates who have a strong foundation in machine learning, particularly in reinforcement learning. Ideal candidates should possess:

  • A solid understanding of AI and machine learning principles.
  • Experience with programming languages such as Python or similar.
  • Familiarity with deep learning frameworks and libraries.
  • Research experience in reinforcement learning or related fields.
  • Excellent problem-solving skills and the ability to work collaboratively in a team environment.

About the job

About Anthropic

At Anthropic, we are dedicated to developing reliable, interpretable, and controllable AI systems. Our goal is to ensure that AI technology is safe and beneficial for both users and society. Our rapidly expanding team consists of passionate researchers, engineers, policy experts, and business leaders collaborating to create beneficial AI systems.

About the Teams

The Reinforcement Learning teams at Anthropic spearhead our research and development in reinforcement learning, playing an essential role in enhancing our AI systems. We have made significant contributions to all Claude models, particularly impacting the autonomy and coding capabilities of Claude Sonnet 4.5 and Opus 4.5. Our work encompasses several critical areas:

  • Creating systems that empower models to utilize computers effectively.
  • Enhancing code generation through reinforcement learning techniques.
  • Conducting pioneering RL research for large language models.
  • Establishing scalable RL infrastructure and training methodologies.
  • Improving model reasoning capabilities.

We work closely with Anthropic's alignment and frontier red teams to ensure our systems are both capable and secure. Additionally, we collaborate with the applied production training team to seamlessly integrate research advancements into deployed models, demonstrating our commitment to implementing research at scale. Our Reinforcement Learning teams operate at the intersection of cutting-edge research and engineering excellence, dedicated to building high-quality, scalable systems that expand the possibilities of AI.

About the Role

As a Research Engineer in the Reinforcement Learning domain, you will partner with a diverse group of researchers and engineers to enhance the capabilities and safety of large language models. This position merges research and engineering responsibilities, requiring you to implement innovative approaches while contributing to the research strategy. Your work will span fundamental research in reinforcement learning, developing 'agentic' models capable of tool use for open-ended tasks such as computer use and autonomous software generation, improving reasoning skills in disciplines like mathematics, and creating prototypes for internal applications, productivity, and evaluation.

Representative Projects:

  • Design and optimize core reinforcement learning infrastructure, from clean training abstractions to distributed experiment management across GPU clusters, scaling our systems to manage increasingly complex research workflows.
  • Invent, implement, and evaluate novel training environments, evaluations, and methodologies for reinforcement learning.

About Anthropic

Anthropic is at the forefront of AI development, committed to creating systems that are not only powerful but also safe and interpretable. Our diverse team works tirelessly to push the boundaries of what AI can achieve, ensuring that our innovations benefit society as a whole.
