Research Scientist - Frontier Data

AfterQuery, San Francisco
On-site, Full-time

Experience Level

Entry Level

Qualifications

Candidates should possess the following qualifications:

  • Experience as an undergraduate or master's research student; a PhD is not required.
  • Experience or internships in RL environments or with AI safety and benchmarking organizations (e.g., METR, Artificial Analysis), which are highly valued.
  • A strong enthusiasm for exploring how data structure, selection, and quality influence model behavior.
  • Ability to design and execute lightweight experiments and derive insights from complex data.
  • Flexibility to work across various domains, including finance, software engineering, and policy.
  • Strong quantitative instincts and familiarity with LLM training pipelines and RLHF/RLVR methodologies.
  • A proactive approach to building solutions rather than merely theorizing.

About the job

About AfterQuery

AfterQuery partners with leading AI labs to advance training data and evaluation frameworks. The team builds high-signal datasets and runs thorough evaluations that go beyond standard benchmarks. As a post-Series A, early-stage company in San Francisco, AfterQuery gives each team member room to shape the future of AI models.

Role Overview: Research Scientist - Frontier Data

This role focuses on designing datasets and developing evaluation systems that influence how top AI models are trained and assessed. Working closely with research teams at major AI labs, the scientist explores new data collection techniques, investigates where models fall short, and sets up metrics to track progress. The work is hands-on and experimental, moving quickly from hypothesis to live testing and directly impacting large-scale model training.

Key Responsibilities

  • Design data slices and analyze data structures to uncover model weaknesses in areas like finance, software development, and enterprise operations.
  • Build and refine evaluation rubrics and reward signals for RLHF and RLVR training approaches.
  • Study annotator behavior and run experiments to improve model capabilities across different domains.
  • Develop quantitative frameworks to measure dataset quality, diversity, and their effect on model alignment and performance.
  • Work with research teams to turn training objectives into concrete data and evaluation needs.

What We Look For

  • Experience as an undergraduate or master’s research student (PhD not required).
  • Background or internships with RL environments or AI safety and benchmarking organizations (e.g., METR, Artificial Analysis) are a strong plus.
  • Genuine interest in how data structure, selection, and quality affect model outcomes.
  • Demonstrated skill in designing experiments, acting quickly, and extracting insights from complex data.
  • Comfort working across sectors such as finance, software engineering, and policy.
  • Strong quantitative background and familiarity with LLM training pipelines, RLHF/RLVR methods, or evaluation frameworks.
  • A hands-on mindset focused on building practical solutions.

About AfterQuery

AfterQuery is at the forefront of creating training data and evaluation infrastructure that empowers leading AI laboratories. By collaborating with these top organizations, we are dedicated to designing high-quality datasets and conducting thorough evaluations that surpass traditional benchmarks. As a small, innovative team, we are passionate about making a significant impact on the future of AI model development.
