About Us
At Preference Model, we are building the next generation of AI training data. Current models are powerful, but their effectiveness is limited in many applications because the tasks fall outside their training distribution. We create reinforcement learning environments where models face real-world research and engineering challenges, so they can iterate and learn through realistic feedback loops.
Our founding team comes from Anthropic's data team, where they built the data infrastructure, tokenizers, and datasets that power Claude. We collaborate with leading AI labs to accelerate AI’s transformative potential and are proudly backed by a16z.
About the Role
We are looking for skilled Machine Learning Engineers to build distributed training infrastructure for our reinforcement learning work. Your responsibilities will include:
- Designing and implementing scalable distributed training infrastructure using PyTorch and Ray.
- Developing automation tools for monitoring, debugging, and recovery in distributed training environments.
- Ensuring the reliability, security, and performance of infrastructure to meet the high demands of large-scale machine learning workloads.
About You
We seek individuals with the following qualifications and traits:
Required Technical Skills:
- Experience in building and managing ML infrastructure at scale.
- Expertise in PyTorch and distributed training paradigms.
- Hands-on experience with Ray.
- Familiarity with at least one modern RL training framework such as verl, NeMo-RL, ART, Atropos, or similar.
- Proficiency in Python and systems programming.
- Experience with container orchestration (Kubernetes) and infrastructure as code (Terraform).
What Makes You Successful:
- Strong systems thinking with an ability to design for scalability.
- Exceptional debugging skills across the entire technology stack.
- A collaborative mindset and strong communication skills for working effectively with researchers and engineers.
- Self-motivation and the ability to solve problems independently while taking ownership of projects.

