About the job
Join Arena Intelligence as a Data Scientist
Arena Intelligence stands at the forefront of AI evaluation, offering an open platform that examines how AI models perform in real-world scenarios. We were founded by researchers from UC Berkeley's SkyLab, and our mission is to push the boundaries of AI utility.
Each month, millions engage with Arena Intelligence to assess the performance of pioneering AI systems, using our community's insights to foster transparent, comprehensive, and human-centric model evaluations. Leading enterprises and AI labs depend on our evaluations to gauge real-world reliability, alignment, and impact. Our leaderboards are regarded as the benchmark for AI performance, trusted by industry leaders and influencing global discussions on model reliability and advancement.
Our diverse team of researchers, engineers, and builders hails from prestigious institutions such as UC Berkeley, Google, Stanford, DeepMind, and Discord. We prioritize truth, agility, and craftsmanship while fostering an environment that values curiosity and impact over hierarchy. At Arena, skilled individuals from all backgrounds are empowered to excel in their fields, contributing to an atmosphere rich in excellence, energy, and focus.
The Role
As a Data Scientist, you will investigate and interpret the data that fuels millions of AI evaluations weekly. Your responsibilities will include generating and testing hypotheses, identifying causal relationships, and revealing insights that enhance our understanding of frontier model behaviors in practical applications. You will collaborate with machine learning researchers and engineers to design experiments, analyze extensive datasets, and develop statistical frameworks aimed at refining the reliability and interpretability of our AI evaluation systems. Senior-level candidates are preferred for this role.

