Evaluations Engineer, Applied AI

Scale AI
San Francisco, CA; New York, NY
On-site Full-time $179.4K/yr - $224.3K/yr

Key Responsibilities

Collaborate with Scale’s Operations team and enterprise clients to convert ambiguity into structured evaluation data, facilitating the development and upkeep of gold-standard human-rated datasets and expert rubrics that form the basis of AI evaluation systems.

Examine feedback and gathered data to discover patterns, enhance evaluation frameworks, and establish iterative improvement cycles that elevate the quality and relevance of human-curated assessments.

Design, research, and develop LLM-as-a-Judge autorater frameworks and AI-assisted evaluation systems, including models that critique, grade, and elucidate agent outputs (e.g., RLAIF, model-judging-model configurations), along with scalable evaluation pipelines and diagnostic tools.

Engage in research projects that investigate new methodologies for the automatic analysis, evaluation, and enhancement of enterprise agent behavior, striving to advance how AI systems are assessed and optimized in practical applications.

Basic Qualifications

Bachelor’s degree in Computer Science, Electrical Engineering, or a related field, or equivalent practical experience.

Over 2 years of experience in Machine Learning or Applied Research, with a focus on applied ML systems or evaluation infrastructure.

Hands-on experience with Large Language Models (LLMs) and Generative AI in professional or research settings.

Strong comprehension of cutting-edge model evaluation methodologies and the current research landscape.

Proficiency in Python and major ML frameworks (e.g., PyTorch, TensorFlow).

Solid engineering...

About the job

Join Scale AI as a passionate and technically adept AI Research Engineer on our Enterprise Evaluations team. This role is central to our goal of providing the industry's leading Generative AI Evaluation Suite. You will contribute to the foundational systems that ensure the safety, reliability, and continuous improvement of LLM-driven workflows and agents for enterprise clients.

The ideal candidate will have a strong understanding of large language models, a passion for solving complex evaluation problems, and the ability to thrive in a fast-moving research environment. We seek an engineer who can innovate, stays current with the latest research in AI evaluation, and is eager to bring cutting-edge research ideas into our workflows to build top-tier evaluation systems.

About Scale AI

Scale AI is at the forefront of AI-driven solutions, dedicated to streamlining operations and enhancing business intelligence through innovative technologies. With a commitment to excellence, we aim to empower enterprises with robust evaluation systems and insights that drive informed decision-making.