Qualifications
Candidates should have a proven ability to conduct high-level research in machine learning and robotics, with skills that encompass both theoretical understanding and practical application. A strong background in system development, combined with innovative thinking in algorithm design, is essential. We seek individuals who can contribute to our team's collaborative spirit and commitment to pushing the boundaries of technology.
About the job
Join us at Physical Intelligence as a Research Scientist, where you will be at the forefront of innovation in machine learning and robotics. We are in search of exceptional researchers across all experience levels who demonstrate a strong track record of impactful research results. Ideal candidates will possess a solid foundation in both practical implementation and theoretical frameworks, showcasing a blend of system-building capabilities and significant conceptual, algorithmic, or theoretical advancements. We value diverse backgrounds and encourage applications from both traditional academic researchers and those with unique, unconventional experiences.
We are committed to fostering a diverse and inclusive workplace. In accordance with the San Francisco Fair Chance Ordinance, we welcome applications from qualified individuals with arrest and conviction records.
About Physical Intelligence
Physical Intelligence is dedicated to pioneering advancements in machine learning and robotics, creating solutions that enhance physical interactions with technology. Our team is passionate about driving innovation and fostering a culture of research excellence.
Similar jobs
Staff Machine Learning Research Scientist, LLM Evaluations
Full-time|$280K/yr - $380K/yr|On-site|San Francisco, CA; Seattle, WA; New York, NY
At Scale AI, we are the premier partner for data and evaluation in the rapidly evolving field of artificial intelligence. Our commitment to advancing the assessment and benchmarking of large language models (LLMs) positions us at the forefront of AI innovation. We are dedicated to creating leading-edge LLM evaluation methodologies that set new benchmarks for model performance. Our research teams collaborate with the top AI laboratories in the industry to provide high-quality data, accelerate progress in generative AI research, and inform what excellence looks like in this domain. As a Staff Machine Learning Research Scientist on our LLM Evals team, you will spearhead the creation of novel evaluation methodologies, metrics, and benchmarks to assess the strengths and weaknesses of cutting-edge LLMs. Your work will shape our internal strategies and influence the broader AI research community, making this role essential for establishing best practices in data-driven AI development.
Full-time|$280K/yr - $380K/yr|On-site|San Francisco, CA; Seattle, WA; New York, NY
As a premier data and evaluation partner for cutting-edge AI firms, Scale AI is committed to enhancing the evaluation and benchmarking of large language models (LLMs). We are developing industry-leading LLM evaluations that set new benchmarks for model performance assessment. Our mission is to create rigorous, scalable, and equitable evaluation methodologies that propel the next evolution of AI capabilities. Our Research teams collaborate with top AI laboratories to provide high-quality data and expedite advancements in Generative AI research. As the Tech Lead/Manager of the LLM Evaluations Research team, you will guide a skilled team of research scientists and engineers dedicated to crafting and applying innovative evaluation methodologies, metrics, and benchmarks that assess the strengths and weaknesses of our advanced LLMs. This pivotal role involves designing and executing a strategic roadmap that establishes best practices in data-driven AI development, thus accelerating the development of the next generation of generative AI models in collaboration with leading foundational model labs.
Full-time|$275K/yr - $350K/yr|On-site|San Francisco, CA; Seattle, WA; New York, NY
About Scale AI At Scale AI, we are dedicated to propelling the advancement of AI applications. Over the past eight years, we have established ourselves as the premier AI data foundry, supporting groundbreaking innovations in fields such as generative AI, defense technologies, and autonomous vehicles. Following our recent Series F funding round, we are intensifying our efforts to harness frontier data, paving the way toward achieving Artificial General Intelligence (AGI). Our work with enterprise clients and governments has enhanced our model evaluation capabilities, allowing us to expand our offerings for both public and private evaluations. About the ACE Team The Agent Capabilities & Environments (ACE) team, a vital part of Scale’s Research organization, unites customer-focused Researchers and Applied AI Engineers. Our primary mission is to conduct research on agent environments and reinforcement learning reward signals, benchmark autonomous agent performance in real-world contexts, and develop robust data programs aimed at enhancing the capabilities of Large Language Models (LLMs). We are committed to creating foundational tools and frameworks for evaluating models as agents, focusing on autonomous agents that interact dynamically with a wide range of external environments, including code repositories and GUI interfaces. About This Role This position sits at the cutting edge of AI research and its practical applications, concentrating on the data types necessary for the development of state-of-the-art agents, including browser and software engineering agents. The ideal candidate will investigate the data landscape required to propel intelligent and adaptable AI agents, steering the data strategy at Scale to foster innovation. This role demands not only expertise in LLM agents and planning algorithms but also creative problem-solving skills to tackle novel challenges pertaining to data, interaction, and evaluation. 
You will contribute to influential research publications on agents, collaborate with customer researchers, and partner with the engineering team to transform these advancements into scalable real-world solutions.
About Retell AI Retell AI builds voice AI technology that helps businesses transform their call center operations. In just 18 months, thousands of companies have adopted Retell’s AI voice agents to streamline sales, support, and logistics, work that once required large human teams. Backed by investors including Y Combinator and Alt Capital, Retell has grown annual recurring revenue from $5M to $36M with a focused team of 20. The company’s goal for 2026: a modern customer experience platform where AI powers entire contact centers. Retell is developing AI “workers” that can serve as frontline agents, quality assurance analysts, and managers, handling, evaluating, and improving customer interactions on their own. Named a top 50 AI app by a16z: https://tinyurl.com/5853dt2x Ranked #4 on Brex’s Fast-Growing Software Vendors of 2025: https://www.brex.com/journal/brex-benchmark-december-2025 Featured on the Lean AI Leaderboard: https://leanaileaderboard.com/ Role Overview: Research Scientist – LLM Retell AI is hiring a Research Scientist focused on large language models (LLMs) and audio processing. This role suits machine learning researchers who want to push the boundaries of real-time AI and see their work in production. What You Will Do Investigate new approaches in large language models and audio processing for human-like voice agents Design and implement evaluation methods for complex, real-world conversational systems Prototype systems to improve reasoning, reduce latency, and enhance conversation quality Work closely with engineering and product teams to bring research advances into production Impact Research at Retell directly shapes the capabilities of voice AI agents for thousands of businesses. The work blends advanced research with practical deployment, improving how customers interact with automated systems across industries. Location This position is based in the San Francisco Bay Area.
Bland Inc. seeks a Machine Learning Researcher specializing in Multimodal Large Language Models (LLMs) to join the team in San Francisco. The focus is on advancing AI systems that integrate language with other types of data. Role overview This position centers on research and development aimed at improving how AI models process and understand information from multiple sources, such as text combined with images or other modalities. What you will do Investigate how language interacts with additional data types within multimodal LLMs Create and evaluate new methods to enhance AI model performance Work closely with colleagues on projects designed to push the boundaries of machine learning Location This role is based in San Francisco.
Join gleanwork as a Machine Learning Engineer specializing in LLM evaluations and observability. In this role, you will be instrumental in developing cutting-edge machine learning systems that improve our understanding of large language models and their effectiveness. You will collaborate with cross-functional teams to drive the integration of advanced analytics and machine learning solutions.
Full-time|$176K/yr - $304K/yr|Hybrid|Cambridge, MA USA; San Francisco, CA USA
Your Contribution at Lila
As a Machine Learning Research Scientist I/II specializing in LLM Inference, you will spearhead research initiatives focused on the training and deployment of large language models for scientific applications.
Your Responsibilities
Develop and refine post-training strategies for LLMs, including Supervised Fine-Tuning (SFT), Reinforcement Learning from Human Feedback (RLHF), and Reinforcement Learning with verifiers. Design efficient inference mechanisms and compute strategies for complex tool utilization in various environments. Create scalable evaluation metrics to assess LLM performance in scientific reasoning tasks. Investigate the boundaries of cutting-edge LLM methodologies for scientific challenges and analyze their limitations.
Full-time|$252K/yr - $315K/yr|On-site|San Francisco, CA; Seattle, WA; New York, NY
At Scale AI, we collaborate with leading AI laboratories to supply high-quality data and foster advancements in Generative AI research. We seek innovative Research Scientists and Research Engineers with a strong focus on post-training techniques for Large Language Models (LLMs), including Supervised Fine-Tuning (SFT), Reinforcement Learning from Human Feedback (RLHF), and reward modeling. This position emphasizes optimizing data curation and evaluation processes to boost LLM performance across text and multimodal formats. In this pivotal role, you will pioneer new methods to enhance the alignment and generalization of extensive generative models. You will work closely with fellow researchers and engineers to establish best practices in data-driven AI development. Additionally, you will collaborate with top foundation model labs, providing critical technical and strategic insights for the evolution of next-generation generative AI models.
At Causal Labs, we are on a groundbreaking mission to develop general causal intelligence: artificial intelligence that not only predicts future events but also determines the most effective actions to influence those outcomes. To achieve this monumental goal, we are constructing a Large Physics Foundation Model (LPM). Our focus is on domains governed by physical laws, which inherently exhibit cause-and-effect relationships, setting them apart from traditional visual or textual data. Weather serves as the ideal training environment for our LPM, being one of the most extensively observed physical systems available. It provides immediate, objective feedback from sensory observations and boasts data scales significantly larger than those currently employed to train existing language models. Our team at Causal Labs includes leading researchers and engineers with backgrounds in self-driving technology, drug discovery, and robotics, hailing from prestigious organizations such as Google DeepMind, Cruise, Waymo, Meta, Nabla Bio, and Apple. We firmly believe that achieving general causal intelligence will represent one of the most critical technological advancements for our civilization. We are seeking innovative researchers eager to confront unsolved challenges in the field. This role presents an opportunity to create powerful models rooted in observable feedback and verifiable ground truths. If you possess experience in pioneering research and training large-scale models from the ground up in areas such as language and vision models, robotics, or biology, we invite you to join our mission.
Join Handshake as a Machine Learning Research Scientist and contribute to groundbreaking projects that leverage advanced algorithms and data analysis to drive innovation. In this role, you will collaborate with a dynamic team to design, implement, and evaluate machine learning models that enhance our products and services. Your expertise will be pivotal in unlocking new insights from data, improving user experiences, and shaping the future of our technology.
Full-time|$197.4K/yr - $246.8K/yr|On-site|San Francisco, CA; New York, NY
Join Scale AI as a Research Scientist, Frontier Risk Evaluations
At Scale AI, we are at the forefront of data and evaluation services for pioneering AI technologies. Our mission is to ensure the safe and effective deployment of AI systems by bridging the gap between advanced AI research and global policy frameworks. With the launch of Scale Labs, we are assembling a dedicated team focused on policy research to empower governments and industry leaders with scientific insights regarding AI risks and functionalities. This team addresses complex challenges in agent robustness, AI control mechanisms, and risk assessments to facilitate a comprehensive understanding of AI risks, while promoting its responsible adoption across various sectors. We are eager to welcome skilled researchers who are passionate about shaping the future of AI. As a Research Scientist specializing in Frontier Risk Evaluations, you will be responsible for designing evaluation metrics, harnesses, and datasets to assess the risks associated with cutting-edge AI systems. Your role may involve: Developing harnesses to evaluate AI models for potential security vulnerabilities and other high-risk behaviors. Collaborating with government entities and research labs to design evaluations that mitigate risks posed by advanced AI technologies. Publishing evaluation methodologies and drafting technical reports aimed at informing policymakers.
Join Arena Intelligence as a Machine Learning Scientist
At Arena Intelligence, we are revolutionizing how AI models are evaluated in real-world scenarios. Founded by innovative researchers from UC Berkeley’s SkyLab, our mission is to push the boundaries of AI evaluation and ensure its practical application. With millions of users engaging with our platform each month, we prioritize community feedback to develop transparent, rigorous, and human-centered model evaluations. Our leaderboards serve as the benchmark for AI performance, gaining the trust of leading enterprises and AI labs to understand the reliability, alignment, and impact of AI systems. Our diverse team comprises experts from esteemed institutions such as UC Berkeley, Google, Stanford, DeepMind, and Discord. We foster a culture that values truth, agility, craftsmanship, curiosity, and impact over hierarchy. We are committed to creating an environment where talented individuals from all backgrounds can excel in their work.
Role Overview
We are seeking a passionate Machine Learning Scientist to spearhead our open-source research initiatives, including the development of open datasets and code releases. You will be instrumental in advancing how AI models are evaluated and understood globally. In this position, you will operationalize our dedication to openness by curating impactful datasets, developing innovative methodologies, and establishing reproducible benchmarks. Your contributions will enhance our public leaderboards, empower community tools, and promote transparency in AI evaluation on a global scale. This interdisciplinary role involves collaboration with engineers, product teams, marketing, and the broader research community to refine model comparisons, analyze preference data, and explore dimensions like style, reasoning, and robustness. You will also work closely with our go-to-market teams to advocate for our open research initiatives, strengthen research partnerships, and encourage community engagement. If you are excited by complex challenges, rigorous evaluation processes, and scientific outreach, we invite you to apply!
Join Our Team at Rad AI
At Rad AI, we are dedicated to transforming the healthcare landscape through the power of artificial intelligence. Established by a radiologist, our innovative AI solutions are revolutionizing the field of radiology, enhancing patient care, alleviating clinician burnout, and significantly reducing the time required for report generation. With access to one of the largest proprietary datasets of radiology reports globally, our AI technologies have facilitated the discovery of numerous new cancer diagnoses and halved error rates across tens of millions of reports. Having raised over $140 million in funding, including a highly successful Series C round of $68 million led by Transformation Capital, we are now valued at $528 million. Our prestigious investors, such as Khosla Ventures, World Innovation Lab, Gradient Ventures, and Cone Health Ventures, are all committed to supporting our mission to empower healthcare professionals with cutting-edge AI tools. Our latest breakthroughs in generative AI are utilized by thousands of radiologists every day, supporting nearly half of all medical imaging across the United States, in partnership with esteemed healthcare organizations like Cone Health, Jefferson Einstein Health, Geisinger, Guthrie Healthcare System, and Henry Ford Health. Recognized as one of the most promising healthcare AI companies by CB Insights and AuntMinnie, and ranked as the 19th fastest-growing company in North America by Deloitte, we are committed to creating AI-powered solutions that make a real difference. Recently, Rad AI was also featured on CNBC’s Disruptor 50 list, showcasing the innovation and momentum behind our mission. If you are eager to impact the future of healthcare positively, we would be thrilled to have you join our talented team!
Why You Should Join Us
We are seeking a Staff Machine Learning Research Scientist to define and lead Rad AI's next wave of applied research in Natural Language Processing (NLP) and clinical AI. You will engage with large language models, retrieval systems, representation learning, speech processing, and multimodal modeling, prioritizing evaluation and reliability alongside achieving state-of-the-art outcomes. This role offers you the chance to take ownership of projects and establish a direct pathway from research to product implementation. As part of our team, you will work closely with clinicians, engineers, and product leaders to translate foundational research into practical applications that enhance healthcare delivery.
Full-time|$150K/yr - $150K/yr|On-site|San Francisco
Become a Pioneer in Sleep Fitness
At Eight Sleep, we're dedicated to unlocking human potential through optimal sleep. As the world's first sleep fitness company, we are revolutionizing what it means to be well-rested by creating the most advanced hardware, software, and AI technology. Our innovative products enhance mental, physical, and emotional performance by transforming each night into a personalized, data-driven recovery journey. Trusted by high achievers, professional athletes, and health-conscious individuals across over 30 countries, we have been recognized by Fast Company as one of the Most Innovative Companies in 2019, 2022, and 2023, and honored twice by TIME's “Best Inventions of the Year.” Our high-performance team operates with speed, focus, and a commitment to impact. We don't just create; we refine and obsess over every detail to ensure our members sleep better and wake up stronger. Every position at Eight Sleep offers the opportunity to innovate cutting-edge technology, collaborate with exceptional talent, and contribute to a future where sleep is a powerful tool for well-being. If you're ready to break away from the ordinary and eager to build at the forefront of possibility, this is your chance to join us in reshaping how the world sleeps and what we can achieve upon waking.
Our Culture: High Standards, No Compromise
Our mission demands intensity, and at Eight Sleep, we embody the mindset of the world's top performers: focused, relentless, and committed to being in the top 1% of our field. Inspired by the relentless drive of legends like Kobe Bryant, we apply that mentality to bold ideas, next-gen technology, and impeccable execution. This is not a standard 9-to-5 role; our team is dedicated, often working 60+ hours per week, not out of obligation, but out of passion. If you thrive under pressure and seek to do the most meaningful work of your career, you'll find a home here. If you prefer an easier path, this position is not for you.
Your Role
As a Machine Learning Research Scientist at Eight Sleep, you will be at the cutting edge of sleep innovation. Your mission will be to leverage innovative technology, minimalistic design, and proven clinical science to personalize and enhance sleep experiences, fundamentally changing how people sleep for the better. Our revolutionary temperature-regulated technology, the Pod, has been recognized as a game changer, enhancing health and happiness by transforming sleep. Join us in making sleep count for more.
Full-time|$252K/yr - $315K/yr|On-site|San Francisco, CA; Seattle, WA; New York, NY
About Scale AI At Scale AI, we are committed to propelling the advancement of AI technologies. For over eight years, we have been a pioneer in the AI data sector, supporting groundbreaking innovations in areas such as generative AI, defense solutions, and autonomous driving. Following our recent Series F funding round, we are enhancing access to premium data to accelerate the journey towards Artificial General Intelligence (AGI). Building on our legacy of model evaluation for both enterprise and governmental clients, we are expanding our capabilities to establish new benchmarks for evaluations in both public and private domains. About This Role This position is at the leading edge of AI research and practical implementation, concentrating on reasoning within large language models (LLMs). The successful candidate will investigate critical data types vital for evolving LLM-based agents, including browser and software engineering agents. You will significantly influence Scale’s data strategy by pinpointing optimal data sources and methodologies to enhance LLM reasoning. To excel in this role, you will require a profound understanding of LLMs, planning algorithms, and fresh approaches to agentic reasoning, alongside inventive solutions to challenges in data generation, model interaction, and evaluation. Your contributions will lead to transformative research on language model reasoning, facilitate collaboration with external researchers, and engage closely with engineering teams to translate cutting-edge advancements into scalable, real-world applications.
At Merge Labs, we are at the forefront of research, dedicated to uniting biological and artificial intelligence to enhance human capability, autonomy, and overall experience. Our innovative approach focuses on developing revolutionary brain-computer interfaces that offer high-bandwidth interaction with the brain, seamlessly integrate advanced AI, and are designed to be safe and accessible for everyone.
About the Team:
Our Bio team is responsible for designing, constructing, and characterizing the biotechnologies that underpin the next generation of brain-computer interfaces. By integrating molecular engineering, synthetic biology, neuroscience, and cutting-edge physical methods such as ultrasound, we aim to establish less invasive, high-bandwidth connections with neurons. The Bio team is dedicated to developing our core molecular technologies, validating their performance both in vitro and in vivo, and showcasing their advanced capabilities in animal models. We create custom experimental setups and pipelines while collaborating closely with engineers and data scientists to tackle some of the most challenging problems in biotechnology.
About the Role:
We are seeking a Senior/Principal Machine Learning Biophysicist to spearhead the creation of scalable molecular dynamics pipelines, integrating physics-based models with machine learning frameworks. You will build the molecular modeling foundations of the company from first principles, establishing tools and workflows for simulating, analyzing, and interpreting biomolecular dynamics to elucidate function relationships. Over time, your contributions will help translate these frameworks into predictive models that expedite molecular engineering, guide experimental campaigns, and facilitate the discovery of highly functional molecules.
Key Responsibilities:
Develop the scientific and engineering framework for protein structure modeling and molecular dynamics, along with integrations into downstream ML frameworks. Collaborate with wet-lab scientists to establish realistic optimization objectives and encode domain-specific priors and constraints. Prototype modeling frameworks utilizing internal and public datasets; benchmark and validate performance. Make complex analyses accessible to non-domain experts through democratization of first-principles analysis. Lead the development of ML frameworks that explicitly incorporate first-principles priors. Stay abreast of the latest advancements in deep learning and molecular dynamics.
About Wispr Flow
At Wispr Flow, we strive to make device interaction as seamless as conversing with a friend. Wispr Flow has revolutionized voice dictation, now preferred by users over traditional keyboards due to its unparalleled accuracy on the first attempt. Our platform is context-aware, personalized, and effective across all devices, whether desktop or mobile. By 2026, we aim to expand beyond dictation to develop native actions within an agentic framework that comprehends and responds to user needs reliably. Our diverse team comprises AI researchers, designers, growth specialists, and engineers dedicated to reimagining human-computer interaction. We value team members who prioritize open communication, exhibit a user-centric mindset, and pay meticulous attention to detail. Our collaborative environment fosters spirited discussions, truth-seeking, and tangible impact. Having achieved a remarkable 150% revenue growth quarterly for the past year, we have successfully raised $81 million from top-tier venture capitalists and renowned angel investors.
Full-time|$150K/yr - $300K/yr|On-site|San Francisco / Bay Area
Position Overview
At Sentra, we are pioneering the development of organizational superintelligence through innovative memory infrastructure that intelligently processes time, causality, and context. As a Machine Learning Research Scientist, you will address fundamental challenges in knowledge representation, temporal reasoning, and semantic compression. Your mission will be to design and implement sophisticated systems that preserve the execution state for entire organizations, transforming millions of micro-events into robust knowledge and identifying patterns for predicting future events.
Key Responsibilities
Develop LLM-powered information extraction pipelines to convert unstructured communications and textual data into structured entity-relationship models. Create memory consolidation algorithms that validate information through multiple observations, merge duplicate entities, and efficiently prune transient data. Architect temporal knowledge graphs that represent organizational execution states as dynamic, continuously updated frameworks instead of static records. Implement graph attention mechanisms and reasoning systems for intricate causal queries regarding blockers, dependencies, and outcome patterns. Conduct research on lossy semantic compression using information-theoretic principles to distill event streams into query-relevant long-term memory. Design entity resolution systems that effectively manage identity evolution, where entities may merge, split, and transform over time. Construct meta-learning systems that uncover organizational patterns and discern when current situations align with historical indicators of success or failure. Innovate privacy-preserving cross-organizational learning approaches utilizing federated learning and differential privacy techniques. Publish research findings and actively contribute to the wider research community focused on knowledge graphs and organizational intelligence.
Full-time|$273K/yr - $393K/yr|On-site|San Francisco, CA; Seattle, WA; New York, NY
At Scale AI, we are at the forefront of artificial intelligence, driving innovation through our advanced data, infrastructure, and tooling that empower the most sophisticated models worldwide. Our teams thrive at the intersection of pioneering research, extensive engineering, and practical deployment, collaborating with leading labs, enterprises, and government entities to explore the vast potential of Generative AI. As AI technology evolves from static models to dynamic, intelligent systems, Scale AI is dedicated to establishing the essential research foundations, evaluation methodologies, and reinforcement learning infrastructure that will shape this transformative era. Join our high-impact research organization, where you will contribute to advancing large language models, post-training evaluation, and agent-based reinforcement learning environments, influencing the future of AI development and implementation. As the Research Scientist Manager, you will spearhead a distinguished team of research scientists and engineers, define the strategic research roadmap, and oversee projects from initial prototyping to final deployment. You will excel in a fast-paced environment, harmonizing deep technical leadership with effective people management, visionary goal setting, and successful delivery.
Mar 26, 2026