Research Scientist In Machine Learning And Robotics jobs in San Francisco – Browse 1,476 openings on RoboApply Jobs
Open roles matching “Research Scientist In Machine Learning And Robotics” with location signals for San Francisco. 1,476 active listings on RoboApply Jobs.
Research Scientist in Machine Learning and Robotics
Qualifications
Candidates should have a proven ability to conduct high-level research in machine learning and robotics, with skills that encompass both theoretical understanding and practical application. A strong background in system development, combined with innovative thinking in algorithm design, is essential. We seek individuals who can contribute to our team's collaborative spirit and commitment to pushing the boundaries of technology.
About the job
Join us at Physical Intelligence as a Research Scientist, where you will be at the forefront of innovation in machine learning and robotics. We are in search of exceptional researchers across all experience levels who demonstrate a strong track record of impactful research results. Ideal candidates will possess a solid foundation in both practical implementation and theoretical frameworks, showcasing a blend of system-building capabilities and significant conceptual, algorithmic, or theoretical advancements. We value diverse backgrounds and encourage applications from both traditional academic researchers and those with unique, unconventional experiences.
We are committed to fostering a diverse and inclusive workplace. In accordance with the San Francisco Fair Chance Ordinance, we welcome applications from qualified individuals with arrest and conviction records.
About Physical Intelligence
Physical Intelligence is dedicated to pioneering advancements in machine learning and robotics, creating solutions that enhance physical interactions with technology. Our team is passionate about driving innovation and fostering a culture of research excellence.
Full-time|On-site|San Francisco Bay Area (San Mateo) or Boston (Somerville)
About the Role
As a Research Scientist focused on Pretraining, you will develop the foundational intelligence layer for robotics. Our mission involves training expansive robot foundation models utilizing vast multimodal datasets that encompass video, proprioception, action traces, language, and beyond. You will lead and execute large-scale training initiatives that imbue our models with groundbreaking general capabilities applicable across various embodiments, tasks, and environments. Your work will involve deeply engaging with all facets of robotic data.
Key Responsibilities:
Design and conduct extensive pretraining efforts for robot foundation models, employing transformer and diffusion architectures.
Establish model architectures, objectives, and training curricula that leverage multimodal robotic data, including vision, action, state, and language inputs.
Create scalable data mixtures and sampling strategies to effectively utilize petabyte-scale datasets.
Direct data collection operations and explore new avenues for dataset sourcing.
Conduct ablation studies to uncover insights regarding scaling laws, data quality impacts, and architectural trade-offs.
Collaborate closely with ML Infrastructure and Systems teams to enhance cluster utilization, throughput, and reliability.
Transform raw robotic interaction data into versatile model capabilities.
Ideal Candidate Profile:
Extensive experience in training large transformer or diffusion models at scale, particularly in generative tasks such as language, audio, or video modeling.
Proven leadership or significant contribution to multi-node, multi-GPU distributed training initiatives.
Experience with scaling laws, optimization dynamics, and understanding large-model failure modes.
Strong foundation in PyTorch and comfort in debugging across all layers of the computational stack.
Appreciation for empirical rigor paired with rapid iteration speed.
Enthusiasm for building general-purpose robot intelligence from foundational principles.
About Generalist
At Generalist, we are dedicated to realizing the potential of general-purpose robots. We envision a future where industries and households thrive through innovative collaborations between humans and machines. Our robots are designed to enhance productivity and facilitate the achievement of more ambitious goals.
At Causal Labs, we are on a groundbreaking mission to develop general causal intelligence: artificial intelligence that not only predicts future events but also determines the most effective actions to influence those outcomes.
To achieve this monumental goal, we are constructing a Large Physics Foundation Model (LPM). Our focus is on domains governed by physical laws, which inherently exhibit cause-and-effect relationships, setting them apart from traditional visual or textual data.
Weather serves as the ideal training environment for our LPM, being one of the most extensively observed physical systems available. It provides immediate, objective feedback from sensory observations and boasts data scales significantly larger than those currently employed to train existing language models.
Our team at Causal Labs includes leading researchers and engineers with backgrounds in self-driving technology, drug discovery, and robotics, hailing from prestigious organizations such as Google DeepMind, Cruise, Waymo, Meta, Nabla Bio, and Apple. We firmly believe that achieving general causal intelligence will represent one of the most critical technological advancements for our civilization.
We are seeking innovative researchers eager to confront unsolved challenges in the field.
This role presents an opportunity to create powerful models rooted in observable feedback and verifiable ground truths. If you possess experience in pioneering research and training large-scale models from the ground up in areas such as language and vision models, robotics, or biology, we invite you to join our mission.
Be Part of the Future of Autonomous Robotics
At Bedrock Robotics, we are pioneering the transition of AI from theoretical frameworks to practical applications in the built environment. Our team is comprised of seasoned professionals who have been instrumental in the success of innovative companies such as Waymo, Segment, and Uber Freight. We are at the forefront of deploying autonomous technologies in heavy construction machinery, significantly enhancing the efficiency and safety of multi-billion dollar infrastructure projects across the nation.
With backing from $350 million in funding, our mission is to address the urgent need for housing, data centers, and manufacturing facilities, while simultaneously responding to the construction industry's labor shortages.
This position is where cutting-edge algorithms meet the practical world of construction. You will work alongside industry experts and top-tier engineers to tackle complex real-world challenges that cannot be simulated. If you are eager to leverage advanced technology for impactful problem-solving within a skilled team, we encourage you to apply.
Join Handshake as a Machine Learning Research Scientist and contribute to groundbreaking projects that leverage advanced algorithms and data analysis to drive innovation. In this role, you will collaborate with a dynamic team to design, implement, and evaluate machine learning models that enhance our products and services. Your expertise will be pivotal in unlocking new insights from data, improving user experiences, and shaping the future of our technology.
Full-time|$150K/yr - $150K/yr|On-site|San Francisco
Become a Pioneer in Sleep Fitness
At Eight Sleep, we're dedicated to unlocking human potential through optimal sleep. As the world's first sleep fitness company, we are revolutionizing what it means to be well-rested by creating the most advanced hardware, software, and AI technology. Our innovative products enhance mental, physical, and emotional performance by transforming each night into a personalized, data-driven recovery journey. Trusted by high achievers, professional athletes, and health-conscious individuals across over 30 countries, we have been recognized by Fast Company as one of the Most Innovative Companies in 2019, 2022, and 2023, and honored twice by TIME's “Best Inventions of the Year.” Our high-performance team operates with speed, focus, and a commitment to impact. We don't just create; we refine and obsess over every detail to ensure our members sleep better and wake up stronger.
Every position at Eight Sleep offers the opportunity to innovate cutting-edge technology, collaborate with exceptional talent, and contribute to a future where sleep is a powerful tool for well-being. If you're ready to break away from the ordinary and eager to build at the forefront of possibility, this is your chance to join us in reshaping how the world sleeps and what we can achieve upon waking.
Our Culture: High Standards, No Compromise
Our mission demands intensity, and at Eight Sleep, we embody the mindset of the world's top performers: focused, relentless, and committed to being in the top 1% of our field. Inspired by the relentless drive of legends like Kobe Bryant, we apply that mentality to bold ideas, next-gen technology, and impeccable execution. This is not a standard 9-to-5 role; our team is dedicated, often working 60+ hours per week, not out of obligation, but out of passion. If you thrive under pressure and seek to do the most meaningful work of your career, you'll find a home here. If you prefer an easier path, this position is not for you.
Your Role
As a Machine Learning Research Scientist at Eight Sleep, you will be at the cutting edge of sleep innovation. Your mission will be to leverage innovative technology, minimalistic design, and proven clinical science to personalize and enhance sleep experiences, fundamentally changing how people sleep for the better.
Our revolutionary temperature-regulated technology, the Pod, has been recognized as a game changer, enhancing health and happiness by transforming sleep. Join us in making sleep count for more.
Full-time|$275K/yr - $350K/yr|On-site|San Francisco, CA; Seattle, WA; New York, NY
About Scale AI
At Scale AI, we are dedicated to propelling the advancement of AI applications. Over the past eight years, we have established ourselves as the premier AI data foundry, supporting groundbreaking innovations in fields such as generative AI, defense technologies, and autonomous vehicles. Following our recent Series F funding round, we are intensifying our efforts to harness frontier data, paving the way toward achieving Artificial General Intelligence (AGI). Our work with enterprise clients and governments has enhanced our model evaluation capabilities, allowing us to expand our offerings for both public and private evaluations.
About the ACE Team
The Agent Capabilities & Environments (ACE) team, a vital part of Scale’s Research organization, unites customer-focused Researchers and Applied AI Engineers. Our primary mission is to conduct research on agent environments and reinforcement learning reward signals, benchmark autonomous agent performance in real-world contexts, and develop robust data programs aimed at enhancing the capabilities of Large Language Models (LLMs). We are committed to creating foundational tools and frameworks for evaluating models as agents, focusing on autonomous agents that interact dynamically with a wide range of external environments, including code repositories and GUI interfaces.
About This Role
This position sits at the cutting edge of AI research and its practical applications, concentrating on the data types necessary for the development of state-of-the-art agents, including browser and software engineering agents. The ideal candidate will investigate the data landscape required to propel intelligent and adaptable AI agents, steering the data strategy at Scale to foster innovation. This role demands not only expertise in LLM agents and planning algorithms but also creative problem-solving skills to tackle novel challenges pertaining to data, interaction, and evaluation. You will contribute to influential research publications on agents, collaborate with customer researchers, and partner with the engineering team to transform these advancements into scalable real-world solutions.
Full-time|$252K/yr - $315K/yr|On-site|San Francisco, CA; Seattle, WA; New York, NY
About Scale AI
At Scale AI, we are committed to propelling the advancement of AI technologies. For over eight years, we have been a pioneer in the AI data sector, supporting groundbreaking innovations in areas such as generative AI, defense solutions, and autonomous driving. Following our recent Series F funding round, we are enhancing access to premium data to accelerate the journey towards Artificial General Intelligence (AGI). Building on our legacy of model evaluation for both enterprise and governmental clients, we are expanding our capabilities to establish new benchmarks for evaluations in both public and private domains.
About This Role
This position is at the leading edge of AI research and practical implementation, concentrating on reasoning within large language models (LLMs). The successful candidate will investigate critical data types vital for evolving LLM-based agents, including browser and software engineering agents. You will significantly influence Scale’s data strategy by pinpointing optimal data sources and methodologies to enhance LLM reasoning. To excel in this role, you will require a profound understanding of LLMs, planning algorithms, and fresh approaches to agentic reasoning, alongside inventive solutions to challenges in data generation, model interaction, and evaluation. Your contributions will lead to transformative research on language model reasoning, facilitate collaboration with external researchers, and engage closely with engineering teams to translate cutting-edge advancements into scalable, real-world applications.
At Merge Labs, we are at the forefront of research, dedicated to uniting biological and artificial intelligence to enhance human capability, autonomy, and overall experience. Our innovative approach focuses on developing revolutionary brain-computer interfaces that offer high-bandwidth interaction with the brain, seamlessly integrate advanced AI, and are designed to be safe and accessible for everyone.
About the Team:
Our Bio team is responsible for designing, constructing, and characterizing the biotechnologies that underpin the next generation of brain-computer interfaces. By integrating molecular engineering, synthetic biology, neuroscience, and cutting-edge physical methods such as ultrasound, we aim to establish less invasive, high-bandwidth connections with neurons. The Bio team is dedicated to developing our core molecular technologies, validating their performance both in vitro and in vivo, and showcasing their advanced capabilities in animal models. We create custom experimental setups and pipelines while collaborating closely with engineers and data scientists to tackle some of the most challenging problems in biotechnology.
About the Role:
We are seeking a Senior/Principal Machine Learning Biophysicist to spearhead the creation of scalable molecular dynamics pipelines, integrating physics-based models with machine learning frameworks. You will build the molecular modeling foundations of the company from first principles, establishing tools and workflows for simulating, analyzing, and interpreting biomolecular dynamics to elucidate function relationships. Over time, your contributions will help translate these frameworks into predictive models that expedite molecular engineering, guide experimental campaigns, and facilitate the discovery of highly functional molecules.
Key Responsibilities:
Develop the scientific and engineering framework for protein structure modeling and molecular dynamics, along with integrations into downstream ML frameworks.
Collaborate with wet-lab scientists to establish realistic optimization objectives and encode domain-specific priors and constraints.
Prototype modeling frameworks utilizing internal and public datasets; benchmark and validate performance.
Make complex analyses accessible to non-domain experts through democratization of first-principles analysis.
Lead the development of ML frameworks that explicitly incorporate first-principles priors.
Stay abreast of the latest advancements in deep learning and molecular dynamics.
Full-time|$252K/yr - $315K/yr|On-site|San Francisco, CA; Seattle, WA; New York, NY
At Scale AI, we collaborate with leading AI laboratories to supply high-quality data and foster advancements in Generative AI research. We seek innovative Research Scientists and Research Engineers with a strong focus on post-training techniques for Large Language Models (LLMs), including Supervised Fine-Tuning (SFT), Reinforcement Learning from Human Feedback (RLHF), and reward modeling. This position emphasizes optimizing data curation and evaluation processes to boost LLM performance across text and multimodal formats. In this pivotal role, you will pioneer new methods to enhance the alignment and generalization of extensive generative models. You will work closely with fellow researchers and engineers to establish best practices in data-driven AI development. Additionally, you will collaborate with top foundation model labs, providing critical technical and strategic insights for the evolution of next-generation generative AI models.
About Wispr Flow
At Wispr Flow, we strive to make device interaction as seamless as conversing with a friend.
Wispr Flow has revolutionized voice dictation, now preferred by users over traditional keyboards due to its unparalleled accuracy on the first attempt. Our platform is context-aware, personalized, and effective across all devices, whether desktop or mobile.
By 2026, we aim to expand beyond dictation to develop native actions within an agentic framework that comprehends and responds to user needs reliably.
Our diverse team comprises AI researchers, designers, growth specialists, and engineers dedicated to reimagining human-computer interaction. We value team members who prioritize open communication, exhibit a user-centric mindset, and pay meticulous attention to detail. Our collaborative environment fosters spirited discussions, truth-seeking, and tangible impact.
Having achieved a remarkable 150% revenue growth quarterly for the past year, we have successfully raised $81 million from top-tier venture capitalists and renowned angel investors.
Full-time|$150K/yr - $300K/yr|On-site|San Francisco / Bay Area
Position Overview
At Sentra, we are pioneering the development of organizational superintelligence through innovative memory infrastructure that intelligently processes time, causality, and context. As a Machine Learning Research Scientist, you will address fundamental challenges in knowledge representation, temporal reasoning, and semantic compression. Your mission will be to design and implement sophisticated systems that preserve the execution state for entire organizations, transforming millions of micro-events into robust knowledge and identifying patterns for predicting future events.
Key Responsibilities
Develop LLM-powered information extraction pipelines to convert unstructured communications and textual data into structured entity-relationship models.
Create memory consolidation algorithms that validate information through multiple observations, merge duplicate entities, and efficiently prune transient data.
Architect temporal knowledge graphs that represent organizational execution states as dynamic, continuously updated frameworks instead of static records.
Implement graph attention mechanisms and reasoning systems for intricate causal queries regarding blockers, dependencies, and outcome patterns.
Conduct research on lossy semantic compression using information-theoretic principles to distill event streams into query-relevant long-term memory.
Design entity resolution systems that effectively manage identity evolution, where entities may merge, split, and transform over time.
Construct meta-learning systems that uncover organizational patterns and discern when current situations align with historical indicators of success or failure.
Innovate privacy-preserving cross-organizational learning approaches utilizing federated learning and differential privacy techniques.
Publish research findings and actively contribute to the wider research community focused on knowledge graphs and organizational intelligence.
Full-time|$280K/yr - $380K/yr|On-site|San Francisco, CA; Seattle, WA; New York, NY
At Scale AI, we are the premier partner for data and evaluation in the rapidly evolving field of artificial intelligence. Our commitment to advancing the assessment and benchmarking of large language models (LLMs) positions us at the forefront of AI innovation. We are dedicated to creating leading-edge LLM evaluation methodologies that set new benchmarks for model performance. Our research teams collaborate with the top AI laboratories in the industry to provide high-quality data, accelerate progress in generative AI research, and inform what excellence looks like in this domain. As a Staff Machine Learning Research Scientist on our LLM Evals team, you will spearhead the creation of novel evaluation methodologies, metrics, and benchmarks to assess the strengths and weaknesses of cutting-edge LLMs. Your work will shape our internal strategies and influence the broader AI research community, making this role essential for establishing best practices in data-driven AI development.
Full-time|$273K/yr - $393K/yr|On-site|San Francisco, CA; Seattle, WA; New York, NY
At Scale AI, we are at the forefront of artificial intelligence, driving innovation through our advanced data, infrastructure, and tooling that empower the most sophisticated models worldwide. Our teams thrive at the intersection of pioneering research, extensive engineering, and practical deployment, collaborating with leading labs, enterprises, and government entities to explore the vast potential of Generative AI. As AI technology evolves from static models to dynamic, intelligent systems, Scale AI is dedicated to establishing the essential research foundations, evaluation methodologies, and reinforcement learning infrastructure that will shape this transformative era. Join our high-impact research organization, where you will contribute to advancing large language models, post-training evaluation, and agent-based reinforcement learning environments, influencing the future of AI development and implementation. As the Research Scientist Manager, you will spearhead a distinguished team of research scientists and engineers, define the strategic research roadmap, and oversee projects from initial prototyping to final deployment. You will excel in a fast-paced environment, harmonizing deep technical leadership with effective people management, visionary goal setting, and successful delivery.
Full-time|On-site|San Francisco Bay Area (San Mateo) or Boston (Somerville)
About the Role
In the realm of machine learning, pretraining lays the foundation for a general model, while post-training refines that model, enhancing its utility, controllability, safety, and performance in real-world applications. As a Post-Training Research Scientist, you will transform large pretrained robot models into production-ready systems through methodologies such as fine-tuning, reinforcement learning, steering, human feedback, task specialization, evaluation, and on-robot validation at scale. This position offers a unique opportunity for individuals from diverse backgrounds to evolve into full-stack ML roboticists, adept at swiftly identifying challenges across machine learning and control domains. This is where innovative research converges with practical implementation.
Your Responsibilities Include:
Crafting fine-tuning and adaptation strategies tailored for specific robotic tasks and embodiments.
Developing methodologies to enhance reliability, robustness, and controllability of robotic systems.
Establishing evaluation frameworks to assess real-world robot performance beyond just offline metrics.
Collaborating with ML infrastructure teams to optimize inference-time performance, including latency, stability, and memory usage.
Utilizing advanced techniques such as imitation learning, reinforcement learning, distillation, synthetic data, and curriculum learning.
Bridging the gap between model outputs and tangible outcomes in the physical world.
You Might Excel in This Role If You:
Possess experience in fine-tuning large models for downstream applications, including RLHF, imitation learning, reinforcement learning, distillation, and domain adaptation.
Have a background in embodied AI, robotics, or real-world machine learning systems.
Demonstrate a strong commitment to evaluation, benchmarking, and failure analysis.
Are comfortable troubleshooting and debugging across the entire ML stack, from analyzing loss curves to understanding robot behavior.
Enjoy rapid iteration and thrive on real-world feedback loops.
Aspire to connect foundational models with practical deployment scenarios.
About Generalist
At Generalist, we are dedicated to realizing the vision of general-purpose robots. We envision a future where industries and homes benefit from collaborative interactions between humans and machines, enabling us to achieve more than ever before. Our focus is on building embodied foundation models, starting with dexterity, and advancing the frontiers of data, models, and hardware to empower robots to intelligently engage with their environments.
Join Our Team at Rad AI
At Rad AI, we are dedicated to transforming the healthcare landscape through the power of artificial intelligence. Established by a radiologist, our innovative AI solutions are revolutionizing the field of radiology, enhancing patient care, alleviating clinician burnout, and significantly reducing the time required for report generation. With access to one of the largest proprietary datasets of radiology reports globally, our AI technologies have facilitated the discovery of numerous new cancer diagnoses and halved error rates across tens of millions of reports.
Having raised over $140 million in funding, including a highly successful Series C round of $68 million led by Transformation Capital, we are now valued at $528 million. Our prestigious investors, such as Khosla Ventures, World Innovation Lab, Gradient Ventures, and Cone Health Ventures, are all committed to supporting our mission to empower healthcare professionals with cutting-edge AI tools.
Our latest breakthroughs in generative AI are utilized by thousands of radiologists every day, supporting nearly half of all medical imaging across the United States, in partnership with esteemed healthcare organizations like Cone Health, Jefferson Einstein Health, Geisinger, Guthrie Healthcare System, and Henry Ford Health.
Recognized as one of the most promising healthcare AI companies by CB Insights and AuntMinnie, and ranked as the 19th fastest-growing company in North America by Deloitte, we are committed to creating AI-powered solutions that make a real difference. Recently, Rad AI was also featured on CNBC’s Disruptor 50 list, showcasing the innovation and momentum behind our mission.
If you are eager to impact the future of healthcare positively, we would be thrilled to have you join our talented team!
Why You Should Join Us
We are seeking a Staff Machine Learning Research Scientist to define and lead Rad AI's next wave of applied research in Natural Language Processing (NLP) and clinical AI. You will engage with large language models, retrieval systems, representation learning, speech processing, and multimodal modeling, prioritizing evaluation and reliability alongside achieving state-of-the-art outcomes. This role offers you the chance to take ownership of projects and establish a direct pathway from research to product implementation.
As part of our team, you will work closely with clinicians, engineers, and product leaders to translate foundational research into practical applications that enhance healthcare delivery.
Join OpenAI as a Research Scientist and explore cutting-edge machine learning innovations. In this role, you will be at the forefront of developing groundbreaking techniques while advancing our team's research initiatives. Collaborate with talented peers across various teams to discover transformative ideas that scale effectively. We seek individuals who are passionate about pushing the boundaries of AI and want to contribute to our unified research vision.
About the Team
Join the innovative Post-Training team at OpenAI, where we focus on refining and elevating pre-trained models for deployment in ChatGPT, our API, and future products. Collaborating closely with various research and product teams, we conduct crucial research that prepares our models for real-world deployment to millions of users, ensuring they are safe, efficient, and reliable.
About the Role
As a Research Engineer / Scientist, you will spearhead the research and development of enhancements to our models. Our work intersects reinforcement learning and product development, aiming to create cutting-edge solutions.
We seek passionate individuals with robust machine learning engineering skills and research experience, particularly with innovative and powerful models. The ideal candidate will be driven by a commitment to product-oriented research.
This position is located in San Francisco, CA, and follows a hybrid work model requiring three days in the office each week. Relocation assistance is available for new employees.
In this role, you will:
Lead and execute a research agenda aimed at enhancing model capabilities and performance.
Work collaboratively with research and product teams to empower customers to optimize their models.
Develop robust evaluation frameworks to monitor and assess modeling advancements.
Design, implement, test, and debug code across our research stack.
You may excel in this role if you:
Possess a deep understanding of machine learning and its applications.
Have experience with relevant models and methodologies for evaluating model improvements.
Are adept at navigating large ML codebases for debugging purposes.
Thrive in a fast-paced and technically intricate environment.
About OpenAI
OpenAI is a pioneering AI research and deployment organization dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We are committed to pushing the boundaries of AI capabilities while prioritizing safety and human-centric values in our products. Our mission is to embrace diverse perspectives, voices, and experiences that represent the full spectrum of humanity, as we strive for a future where AI is a powerful ally for everyone.
Join the forefront of technology as we revolutionize the construction industry with advanced autonomy.
Be a Part of Innovation at Bedrock Robotics
At Bedrock Robotics, we are transforming artificial intelligence from theoretical concepts into practical applications that enhance the world’s infrastructure. Our team comprises seasoned professionals who have been pivotal in launching industry leaders like Waymo, scaling Segment to a $3.2 billion acquisition, and driving Uber Freight to $5 billion in revenue. We are currently implementing autonomous systems in heavy construction machinery nationwide, expediting the timelines of multi-billion dollar infrastructure developments while significantly enhancing job site safety.
With a robust funding of $350 million, we are rapidly addressing the escalating demand for housing, data centers, and manufacturing facilities, all while tackling the construction sector's increasing labor shortages. This is where innovative algorithms meet the realities of heavy machinery.
If you are passionate about leveraging cutting-edge technology to tackle real-world challenges and wish to collaborate with a talented team of industry experts, we invite you to apply.
Role: Machine Learning Engineer: Evaluation
Bedrock Robotics is on a mission to integrate autonomy into construction processes! We are seeking a driven engineer with substantial experience in evaluating complex machine learning systems in real-world scenarios. Your objective will be to convert the intricate dynamics of the built environment into actionable, AI-driven evaluations that enhance the adoption of our Bedrock Operators.
The ideal candidate will have a proven track record in developing evaluation systems and executing statistical analyses to assess performance variations across system iterations. Your experience in iterating on complex machine learning systems in production environments will be crucial, as you navigate the intricacies involved.
Your Responsibilities:
- Design and Maintain Evaluation Systems: Develop and maintain pipelines for performance measurement—encompassing both open-loop and closed-loop simulations, hardware-in-the-loop systems, and field data from Bedrock Operator-equipped machinery. Collaborate with other teams to glean insights earlier in the development cycle through optimized workflows.
- Develop Metrics: Align product objectives with system behaviors by translating real-world specifications into measurable indicators derived from logged data. Enable data-driven decision-making for everything from parameter adjustments to strategic program planning.
Join Arena Intelligence as a Machine Learning Scientist
At Arena Intelligence, we are revolutionizing how AI models are evaluated in real-world scenarios. Founded by innovative researchers from UC Berkeley’s SkyLab, our mission is to push the boundaries of AI evaluation and ensure its practical application.
With millions of users engaging with our platform each month, we prioritize community feedback to develop transparent, rigorous, and human-centered model evaluations. Our leaderboards serve as the benchmark for AI performance, gaining the trust of leading enterprises and AI labs to understand the reliability, alignment, and impact of AI systems.
Our diverse team comprises experts from esteemed institutions such as UC Berkeley, Google, Stanford, DeepMind, and Discord. We foster a culture that values truth, agility, craftsmanship, curiosity, and impact over hierarchy. We are committed to creating an environment where talented individuals from all backgrounds can excel in their work.
Role Overview
We are seeking a passionate Machine Learning Scientist to spearhead our open-source research initiatives, including the development of open datasets and code releases. You will be instrumental in advancing how AI models are evaluated and understood globally.
In this position, you will operationalize our dedication to openness by curating impactful datasets, developing innovative methodologies, and establishing reproducible benchmarks. Your contributions will enhance our public leaderboards, empower community tools, and promote transparency in AI evaluation on a global scale.
This interdisciplinary role involves collaboration with engineers, product teams, marketing, and the broader research community to refine model comparisons, analyze preference data, and explore dimensions like style, reasoning, and robustness. You will also work closely with our go-to-market teams to advocate for our open research initiatives, strengthen research partnerships, and encourage community engagement.
If you are excited by complex challenges, rigorous evaluation processes, and scientific outreach, we invite you to apply!
Join us at Physical Intelligence as a Research Scientist, where you will be at the forefront of innovation in machine learning and robotics. We are in search of exceptional researchers across all experience levels who demonstrate a strong track record of impactful research results. Ideal candidates will possess a solid foundation in both practical implementation and theoretical frameworks, showcasing a blend of system-building capabilities and significant conceptual, algorithmic, or theoretical advancements. We value diverse backgrounds and encourage applications from both traditional academic researchers and those with unique, unconventional experiences.
We are committed to fostering a diverse and inclusive workplace. In accordance with the San Francisco Fair Chance Ordinance, we welcome applications from qualified individuals with arrest and conviction records.
Full-time|On-site|San Francisco Bay Area (San Mateo) or Boston (Somerville)
About the Role
As a Research Scientist focused on Pretraining, you will develop the foundational intelligence layer for robotics. Our mission involves training expansive robot foundation models utilizing vast multimodal datasets that encompass video, proprioception, action traces, language, and beyond. You will lead and execute large-scale training initiatives that imbue our models with groundbreaking general capabilities applicable across various embodiments, tasks, and environments. Your work will involve deeply engaging with all facets of robotic data.
Key Responsibilities:
- Design and conduct extensive pretraining efforts for robot foundation models, employing transformer and diffusion architectures.
- Establish model architectures, objectives, and training curricula that leverage multimodal robotic data, including vision, action, state, and language inputs.
- Create scalable data mixtures and sampling strategies to effectively utilize petabyte-scale datasets.
- Direct data collection operations and explore new avenues for dataset sourcing.
- Conduct ablation studies to uncover insights regarding scaling laws, data quality impacts, and architectural trade-offs.
- Collaborate closely with ML Infrastructure and Systems teams to enhance cluster utilization, throughput, and reliability.
- Transform raw robotic interaction data into versatile model capabilities.
Ideal Candidate Profile:
- Extensive experience in training large transformer or diffusion models at scale, particularly in generative tasks such as language, audio, or video modeling.
- Proven leadership or significant contribution to multi-node, multi-GPU distributed training initiatives.
- Experience with scaling laws, optimization dynamics, and understanding large-model failure modes.
- Strong foundation in PyTorch and comfort in debugging across all layers of the computational stack.
- Appreciation for empirical rigor paired with rapid iteration speed.
- Enthusiasm for building general-purpose robot intelligence from foundational principles.
About Generalist
At Generalist, we are dedicated to realizing the potential of general-purpose robots. We envision a future where industries and households thrive through innovative collaborations between humans and machines. Our robots are designed to enhance productivity and facilitate the achievement of more ambitious goals.
At Causal Labs, we are on a groundbreaking mission to develop general causal intelligence—artificial intelligence that not only predicts future events but also determines the most effective actions to influence those outcomes.
To achieve this monumental goal, we are constructing a Large Physics Foundation Model (LPM). Our focus is on domains governed by physical laws, which inherently exhibit cause-and-effect relationships, setting them apart from traditional visual or textual data.
Weather serves as the ideal training environment for our LPM, being one of the most extensively observed physical systems available. It provides immediate, objective feedback from sensory observations and boasts data scales significantly larger than those currently employed to train existing language models.
Our team at Causal Labs includes leading researchers and engineers with backgrounds in self-driving technology, drug discovery, and robotics, hailing from prestigious organizations such as Google DeepMind, Cruise, Waymo, Meta, Nabla Bio, and Apple. We firmly believe that achieving general causal intelligence will represent one of the most critical technological advancements for our civilization.
We are seeking innovative researchers eager to confront unsolved challenges in the field.
This role presents an opportunity to create powerful models rooted in observable feedback and verifiable ground truths. If you possess experience in pioneering research and training large-scale models from the ground up in areas such as language and vision models, robotics, or biology, we invite you to join our mission.
Be Part of the Future of Autonomous Robotics
At Bedrock Robotics, we are pioneering the transition of AI from theoretical frameworks to practical applications in the built environment. Our team comprises seasoned professionals who have been instrumental in the success of innovative companies such as Waymo, Segment, and Uber Freight. We are at the forefront of deploying autonomous technologies in heavy construction machinery, significantly enhancing the efficiency and safety of multi-billion dollar infrastructure projects across the nation.
With backing from $350 million in funding, our mission is to address the urgent need for housing, data centers, and manufacturing facilities, while simultaneously responding to the construction industry's labor shortages.
This position is where cutting-edge algorithms meet the practical world of construction. You will work alongside industry experts and top-tier engineers to tackle complex real-world challenges that cannot be simulated. If you are eager to leverage advanced technology for impactful problem-solving within a skilled team, we encourage you to apply.
Join Handshake as a Machine Learning Research Scientist and contribute to groundbreaking projects that leverage advanced algorithms and data analysis to drive innovation. In this role, you will collaborate with a dynamic team to design, implement, and evaluate machine learning models that enhance our products and services. Your expertise will be pivotal in unlocking new insights from data, improving user experiences, and shaping the future of our technology.
Full-time|$150K/yr - $150K/yr|On-site|San Francisco
Become a Pioneer in Sleep Fitness
At Eight Sleep, we're dedicated to unlocking human potential through optimal sleep. As the world's first sleep fitness company, we are revolutionizing what it means to be well-rested by creating the most advanced hardware, software, and AI technology. Our innovative products enhance mental, physical, and emotional performance by transforming each night into a personalized, data-driven recovery journey. Trusted by high achievers, professional athletes, and health-conscious individuals across over 30 countries, we have been recognized by Fast Company as one of the Most Innovative Companies in 2019, 2022, and 2023, and honored twice by TIME's “Best Inventions of the Year.” Our high-performance team operates with speed, focus, and a commitment to impact. We don't just create; we refine and obsess over every detail to ensure our members sleep better and wake up stronger.
Every position at Eight Sleep offers the opportunity to innovate cutting-edge technology, collaborate with exceptional talent, and contribute to a future where sleep is a powerful tool for well-being. If you're ready to break away from the ordinary and eager to build at the forefront of possibility, this is your chance to join us in reshaping how the world sleeps and what we can achieve upon waking.
Our Culture: High Standards, No Compromise
Our mission demands intensity, and at Eight Sleep, we embody the mindset of the world's top performers: focused, relentless, and committed to being in the top 1% of our field. Inspired by the relentless drive of legends like Kobe Bryant, we apply that mentality to bold ideas, next-gen technology, and impeccable execution. This is not a standard 9-to-5 role; our team is dedicated, often working 60+ hours per week—not out of obligation, but out of passion. If you thrive under pressure and seek to do the most meaningful work of your career, you'll find a home here. If you prefer an easier path, this position is not for you.
Your Role
As a Machine Learning Research Scientist at Eight Sleep, you will be at the cutting edge of sleep innovation. Your mission will be to leverage innovative technology, minimalistic design, and proven clinical science to personalize and enhance sleep experiences, fundamentally changing how people sleep for the better.
Our revolutionary temperature-regulated technology, the Pod, has been recognized as a game changer, enhancing health and happiness by transforming sleep. Join us in making sleep count for more.
Full-time|$275K/yr - $350K/yr|On-site|San Francisco, CA; Seattle, WA; New York, NY
About Scale AI
At Scale AI, we are dedicated to propelling the advancement of AI applications. Over the past eight years, we have established ourselves as the premier AI data foundry, supporting groundbreaking innovations in fields such as generative AI, defense technologies, and autonomous vehicles. Following our recent Series F funding round, we are intensifying our efforts to harness frontier data, paving the way toward achieving Artificial General Intelligence (AGI). Our work with enterprise clients and governments has enhanced our model evaluation capabilities, allowing us to expand our offerings for both public and private evaluations.
About the ACE Team
The Agent Capabilities & Environments (ACE) team, a vital part of Scale’s Research organization, unites customer-focused Researchers and Applied AI Engineers. Our primary mission is to conduct research on agent environments and reinforcement learning reward signals, benchmark autonomous agent performance in real-world contexts, and develop robust data programs aimed at enhancing the capabilities of Large Language Models (LLMs). We are committed to creating foundational tools and frameworks for evaluating models as agents, focusing on autonomous agents that interact dynamically with a wide range of external environments, including code repositories and GUI interfaces.
About This Role
This position sits at the cutting edge of AI research and its practical applications, concentrating on the data types necessary for the development of state-of-the-art agents, including browser and software engineering agents. The ideal candidate will investigate the data landscape required to propel intelligent and adaptable AI agents, steering the data strategy at Scale to foster innovation. This role demands not only expertise in LLM agents and planning algorithms but also creative problem-solving skills to tackle novel challenges pertaining to data, interaction, and evaluation. You will contribute to influential research publications on agents, collaborate with customer researchers, and partner with the engineering team to transform these advancements into scalable real-world solutions.
Full-time|$252K/yr - $315K/yr|On-site|San Francisco, CA; Seattle, WA; New York, NY
About Scale AI
At Scale AI, we are committed to propelling the advancement of AI technologies. For over eight years, we have been a pioneer in the AI data sector, supporting groundbreaking innovations in areas such as generative AI, defense solutions, and autonomous driving. Following our recent Series F funding round, we are enhancing access to premium data to accelerate the journey towards Artificial General Intelligence (AGI). Building on our legacy of model evaluation for both enterprise and governmental clients, we are expanding our capabilities to establish new benchmarks for evaluations in both public and private domains.
About This Role
This position is at the leading edge of AI research and practical implementation, concentrating on reasoning within large language models (LLMs). The successful candidate will investigate critical data types vital for evolving LLM-based agents, including browser and software engineering agents. You will significantly influence Scale’s data strategy by pinpointing optimal data sources and methodologies to enhance LLM reasoning. To excel in this role, you will require a profound understanding of LLMs, planning algorithms, and fresh approaches to agentic reasoning, alongside inventive solutions to challenges in data generation, model interaction, and evaluation. Your contributions will lead to transformative research on language model reasoning, facilitate collaboration with external researchers, and engage closely with engineering teams to translate cutting-edge advancements into scalable, real-world applications.
At Merge Labs, we are at the forefront of research, dedicated to uniting biological and artificial intelligence to enhance human capability, autonomy, and overall experience. Our innovative approach focuses on developing revolutionary brain-computer interfaces that offer high-bandwidth interaction with the brain, seamlessly integrate advanced AI, and are designed to be safe and accessible for everyone.
About the Team:
Our Bio team is responsible for designing, constructing, and characterizing the biotechnologies that underpin the next generation of brain-computer interfaces. By integrating molecular engineering, synthetic biology, neuroscience, and cutting-edge physical methods such as ultrasound, we aim to establish less invasive, high-bandwidth connections with neurons. The Bio team is dedicated to developing our core molecular technologies, validating their performance both in vitro and in vivo, and showcasing their advanced capabilities in animal models. We create custom experimental setups and pipelines while collaborating closely with engineers and data scientists to tackle some of the most challenging problems in biotechnology.
About the Role:
We are seeking a Senior/Principal Machine Learning Biophysicist to spearhead the creation of scalable molecular dynamics pipelines, integrating physics-based models with machine learning frameworks. You will build the molecular modeling foundations of the company from first principles, establishing tools and workflows for simulating, analyzing, and interpreting biomolecular dynamics to elucidate function relationships. Over time, your contributions will help translate these frameworks into predictive models that expedite molecular engineering, guide experimental campaigns, and facilitate the discovery of highly functional molecules.
Key Responsibilities:
- Develop the scientific and engineering framework for protein structure modeling and molecular dynamics, along with integrations into downstream ML frameworks.
- Collaborate with wet-lab scientists to establish realistic optimization objectives and encode domain-specific priors and constraints.
- Prototype modeling frameworks utilizing internal and public datasets; benchmark and validate performance.
- Make complex analyses accessible to non-domain experts through democratization of first-principles analysis.
- Lead the development of ML frameworks that explicitly incorporate first-principles priors.
- Stay abreast of the latest advancements in deep learning and molecular dynamics.
Full-time|$252K/yr - $315K/yr|On-site|San Francisco, CA; Seattle, WA; New York, NY
At Scale AI, we collaborate with leading AI laboratories to supply high-quality data and foster advancements in Generative AI research. We seek innovative Research Scientists and Research Engineers with a strong focus on post-training techniques for Large Language Models (LLMs), including Supervised Fine-Tuning (SFT), Reinforcement Learning from Human Feedback (RLHF), and reward modeling. This position emphasizes optimizing data curation and evaluation processes to boost LLM performance across text and multimodal formats. In this pivotal role, you will pioneer new methods to enhance the alignment and generalization of extensive generative models. You will work closely with fellow researchers and engineers to establish best practices in data-driven AI development. Additionally, you will collaborate with top foundation model labs, providing critical technical and strategic insights for the evolution of next-generation generative AI models.
About Wispr Flow
At Wispr Flow, we strive to make device interaction as seamless as conversing with a friend.
Wispr Flow has revolutionized voice dictation, now preferred by users over traditional keyboards due to its unparalleled accuracy on the first attempt. Our platform is context-aware, personalized, and effective across all devices, whether desktop or mobile.
By 2026, we aim to expand beyond dictation to develop native actions within an agentic framework that comprehends and responds to user needs reliably.
Our diverse team comprises AI researchers, designers, growth specialists, and engineers dedicated to reimagining human-computer interaction. We value team members who prioritize open communication, exhibit a user-centric mindset, and pay meticulous attention to detail. Our collaborative environment fosters spirited discussions, truth-seeking, and tangible impact.
Having achieved a remarkable 150% revenue growth quarterly for the past year, we have successfully raised $81 million from top-tier venture capitalists and renowned angel investors.
Full-time|$150K/yr - $300K/yr|On-site|San Francisco / Bay Area
Position Overview
At Sentra, we are pioneering the development of organizational superintelligence through innovative memory infrastructure that intelligently processes time, causality, and context. As a Machine Learning Research Scientist, you will address fundamental challenges in knowledge representation, temporal reasoning, and semantic compression. Your mission will be to design and implement sophisticated systems that preserve the execution state for entire organizations, transforming millions of micro-events into robust knowledge and identifying patterns for predicting future events.
Key Responsibilities
- Develop LLM-powered information extraction pipelines to convert unstructured communications and textual data into structured entity-relationship models.
- Create memory consolidation algorithms that validate information through multiple observations, merge duplicate entities, and efficiently prune transient data.
- Architect temporal knowledge graphs that represent organizational execution states as dynamic, continuously updated frameworks instead of static records.
- Implement graph attention mechanisms and reasoning systems for intricate causal queries regarding blockers, dependencies, and outcome patterns.
- Conduct research on lossy semantic compression using information-theoretic principles to distill event streams into query-relevant long-term memory.
- Design entity resolution systems that effectively manage identity evolution, where entities may merge, split, and transform over time.
- Construct meta-learning systems that uncover organizational patterns and discern when current situations align with historical indicators of success or failure.
- Innovate privacy-preserving cross-organizational learning approaches utilizing federated learning and differential privacy techniques.
- Publish research findings and actively contribute to the wider research community focused on knowledge graphs and organizational intelligence.
Full-time|$280K/yr - $380K/yr|On-site|San Francisco, CA; Seattle, WA; New York, NY
At Scale AI, we are the premier partner for data and evaluation in the rapidly evolving field of artificial intelligence. Our commitment to advancing the assessment and benchmarking of large language models (LLMs) positions us at the forefront of AI innovation. We are dedicated to creating leading-edge LLM evaluation methodologies that set new benchmarks for model performance. Our research teams collaborate with the top AI laboratories in the industry to provide high-quality data, accelerate progress in generative AI research, and inform what excellence looks like in this domain. As a Staff Machine Learning Research Scientist on our LLM Evals team, you will spearhead the creation of novel evaluation methodologies, metrics, and benchmarks to assess the strengths and weaknesses of cutting-edge LLMs. Your work will shape our internal strategies and influence the broader AI research community, making this role essential for establishing best practices in data-driven AI development.
Full-time|$273K/yr - $393K/yr|On-site|San Francisco, CA; Seattle, WA; New York, NY
At Scale AI, we are at the forefront of artificial intelligence, driving innovation through our advanced data, infrastructure, and tooling that empower the most sophisticated models worldwide. Our teams thrive at the intersection of pioneering research, extensive engineering, and practical deployment, collaborating with leading labs, enterprises, and government entities to explore the vast potential of Generative AI. As AI technology evolves from static models to dynamic, intelligent systems, Scale AI is dedicated to establishing the essential research foundations, evaluation methodologies, and reinforcement learning infrastructure that will shape this transformative era. Join our high-impact research organization, where you will contribute to advancing large language models, post-training evaluation, and agent-based reinforcement learning environments, influencing the future of AI development and implementation. As the Research Scientist Manager, you will spearhead a distinguished team of research scientists and engineers, define the strategic research roadmap, and oversee projects from initial prototyping to final deployment. You will excel in a fast-paced environment, harmonizing deep technical leadership with effective people management, visionary goal setting, and successful delivery.
Full-time|On-site|San Francisco Bay Area (San Mateo) or Boston (Somerville)
About the Role
In the realm of machine learning, pretraining lays the foundation for a general model, while post-training refines that model, enhancing its utility, controllability, safety, and performance in real-world applications. As a Post-Training Research Scientist, you will transform large pretrained robot models into production-ready systems through methodologies such as fine-tuning, reinforcement learning, steering, human feedback, task specialization, evaluation, and on-robot validation at scale. This position offers a unique opportunity for individuals from diverse backgrounds to evolve into full-stack ML roboticists, adept at swiftly identifying challenges across machine learning and control domains. This is where innovative research converges with practical implementation.
Your Responsibilities Include:
- Crafting fine-tuning and adaptation strategies tailored for specific robotic tasks and embodiments.
- Developing methodologies to enhance reliability, robustness, and controllability of robotic systems.
- Establishing evaluation frameworks to assess real-world robot performance beyond just offline metrics.
- Collaborating with ML infrastructure teams to optimize inference-time performance, including latency, stability, and memory usage.
- Utilizing advanced techniques such as imitation learning, reinforcement learning, distillation, synthetic data, and curriculum learning.
- Bridging the gap between model outputs and tangible outcomes in the physical world.
You Might Excel in This Role If You:
- Possess experience in fine-tuning large models for downstream applications, including RLHF, imitation learning, reinforcement learning, distillation, and domain adaptation.
- Have a background in embodied AI, robotics, or real-world machine learning systems.
- Demonstrate a strong commitment to evaluation, benchmarking, and failure analysis.
- Are comfortable troubleshooting and debugging across the entire ML stack, from analyzing loss curves to understanding robot behavior.
- Enjoy rapid iteration and thrive on real-world feedback loops.
- Aspire to connect foundational models with practical deployment scenarios.
About Generalist
At Generalist, we are dedicated to realizing the vision of general-purpose robots. We envision a future where industries and homes benefit from collaborative interactions between humans and machines, enabling us to achieve more than ever before. Our focus is on building embodied foundation models, starting with dexterity, and advancing the frontiers of data, models, and hardware to empower robots to intelligently engage with their environments.
Join Our Team at Rad AI
At Rad AI, we are dedicated to transforming the healthcare landscape through the power of artificial intelligence. Established by a radiologist, our innovative AI solutions are revolutionizing the field of radiology, enhancing patient care, alleviating clinician burnout, and significantly reducing the time required for report generation. With access to one of the largest proprietary datasets of radiology reports globally, our AI technologies have facilitated the discovery of numerous new cancer diagnoses and halved error rates across tens of millions of reports.
Having raised over $140 million in funding, including a highly successful Series C round of $68 million led by Transformation Capital, we are now valued at $528 million. Our prestigious investors, such as Khosla Ventures, World Innovation Lab, Gradient Ventures, and Cone Health Ventures, are all committed to supporting our mission to empower healthcare professionals with cutting-edge AI tools.
Our latest breakthroughs in generative AI are utilized by thousands of radiologists every day, supporting nearly half of all medical imaging across the United States, in partnership with esteemed healthcare organizations like Cone Health, Jefferson Einstein Health, Geisinger, Guthrie Healthcare System, and Henry Ford Health.
Recognized as one of the most promising healthcare AI companies by CB Insights and AuntMinnie, and ranked as the 19th fastest-growing company in North America by Deloitte, we are committed to creating AI-powered solutions that make a real difference. Recently, Rad AI was also featured on CNBC’s Disruptor 50 list, showcasing the innovation and momentum behind our mission.
If you are eager to impact the future of healthcare positively, we would be thrilled to have you join our talented team!
Why You Should Join Us
We are seeking a Staff Machine Learning Research Scientist to define and lead Rad AI's next wave of applied research in Natural Language Processing (NLP) and clinical AI. You will engage with large language models, retrieval systems, representation learning, speech processing, and multimodal modeling, prioritizing evaluation and reliability alongside achieving state-of-the-art outcomes. This role offers you the chance to take ownership of projects and establish a direct pathway from research to product implementation.
As part of our team, you will work closely with clinicians, engineers, and product leaders to translate foundational research into practical applications that enhance healthcare delivery.
Join OpenAI as a Research Scientist and explore cutting-edge machine learning innovations. In this role, you will be at the forefront of developing groundbreaking techniques while advancing our team's research initiatives. Collaborate with talented peers across various teams to discover transformative ideas that scale effectively. We seek individuals who are passionate about pushing the boundaries of AI and want to contribute to our unified research vision.
About the Team
Join the innovative Post-Training team at OpenAI, where we focus on refining and elevating pre-trained models for deployment in ChatGPT, our API, and future products. Collaborating closely with various research and product teams, we conduct crucial research that prepares our models for real-world deployment to millions of users, ensuring they are safe, efficient, and reliable.
About the Role
As a Research Engineer / Scientist, you will spearhead the research and development of enhancements to our models. Our work intersects reinforcement learning and product development, aiming to create cutting-edge solutions.
We seek passionate individuals with robust machine learning engineering skills and research experience, particularly with innovative and powerful models. The ideal candidate will be driven by a commitment to product-oriented research.
This position is located in San Francisco, CA, and follows a hybrid work model requiring three days in the office each week. Relocation assistance is available for new employees.
In this role, you will:
Lead and execute a research agenda aimed at enhancing model capabilities and performance.
Work collaboratively with research and product teams to empower customers to optimize their models.
Develop robust evaluation frameworks to monitor and assess modeling advancements.
Design, implement, test, and debug code across our research stack.
You may excel in this role if you:
Possess a deep understanding of machine learning and its applications.
Have experience with relevant models and methodologies for evaluating model improvements.
Are adept at navigating large ML codebases for debugging purposes.
Thrive in a fast-paced and technically intricate environment.
About OpenAI
OpenAI is a pioneering AI research and deployment organization dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We are committed to pushing the boundaries of AI capabilities while prioritizing safety and human-centric values in our products. Our mission is to embrace diverse perspectives, voices, and experiences that represent the full spectrum of humanity, as we strive for a future where AI is a powerful ally for everyone.
Join the forefront of technology as we revolutionize the construction industry with advanced autonomy.
Be a Part of Innovation at Bedrock Robotics
At Bedrock Robotics, we are transforming artificial intelligence from theoretical concepts into practical applications that enhance the world’s infrastructure. Our team comprises seasoned professionals who have been pivotal in launching industry leaders like Waymo, scaling Segment to a $3.2 billion acquisition, and driving Uber Freight to $5 billion in revenue. We are currently implementing autonomous systems in heavy construction machinery nationwide, expediting the timelines of multi-billion dollar infrastructure developments while significantly enhancing job site safety.
With a robust funding of $350 million, we are rapidly addressing the escalating demand for housing, data centers, and manufacturing facilities, all while tackling the construction sector's increasing labor shortages. This is where innovative algorithms meet the realities of heavy machinery.
If you are passionate about leveraging cutting-edge technology to tackle real-world challenges and wish to collaborate with a talented team of industry experts, we invite you to apply.
Role: Machine Learning Engineer: Evaluation
Bedrock Robotics is on a mission to integrate autonomy into construction processes! We are seeking a driven engineer with substantial experience in evaluating complex machine learning systems in real-world scenarios. Your objective will be to convert the intricate dynamics of the built environment into actionable, AI-driven evaluations that enhance the adoption of our Bedrock Operators.
The ideal candidate will have a proven track record in developing evaluation systems and executing statistical analyses to assess performance variations across system iterations. Your experience in iterating on complex machine learning systems in production environments will be crucial, as you navigate the intricacies involved.
Your Responsibilities:
Design and Maintain Evaluation Systems: Develop and maintain pipelines for performance measurement, encompassing both open-loop and closed-loop simulations, hardware-in-the-loop systems, and field data from Bedrock Operator-equipped machinery. Collaborate with other teams to glean insights earlier in the development cycle through optimized workflows.
Develop Metrics: Align product objectives with system behaviors by translating real-world specifications into measurable indicators derived from logged data. Enable data-driven decision-making for everything from parameter adjustments to strategic program planning.
Join Arena Intelligence as a Machine Learning Scientist
At Arena Intelligence, we are revolutionizing how AI models are evaluated in real-world scenarios. Founded by innovative researchers from UC Berkeley’s SkyLab, our mission is to push the boundaries of AI evaluation and ensure its practical application.
With millions of users engaging with our platform each month, we prioritize community feedback to develop transparent, rigorous, and human-centered model evaluations. Our leaderboards serve as the benchmark for AI performance, gaining the trust of leading enterprises and AI labs to understand the reliability, alignment, and impact of AI systems.
Our diverse team comprises experts from esteemed institutions such as UC Berkeley, Google, Stanford, DeepMind, and Discord. We foster a culture that values truth, agility, craftsmanship, curiosity, and impact over hierarchy. We are committed to creating an environment where talented individuals from all backgrounds can excel in their work.
Role Overview
We are seeking a passionate Machine Learning Scientist to spearhead our open-source research initiatives, including the development of open datasets and code releases. You will be instrumental in advancing how AI models are evaluated and understood globally.
In this position, you will operationalize our dedication to openness by curating impactful datasets, developing innovative methodologies, and establishing reproducible benchmarks. Your contributions will enhance our public leaderboards, empower community tools, and promote transparency in AI evaluation on a global scale.
This interdisciplinary role involves collaboration with engineers, product teams, marketing, and the broader research community to refine model comparisons, analyze preference data, and explore dimensions like style, reasoning, and robustness. You will also work closely with our go-to-market teams to advocate for our open research initiatives, strengthen research partnerships, and encourage community engagement.
If you are excited by complex challenges, rigorous evaluation processes, and scientific outreach, we invite you to apply!
Dec 18, 2025