Waymo
Mountain View, CA USA; San Francisco, CA USA
Hybrid Full-time
Experience Level
Senior
Qualifications
Proven experience in machine learning, particularly in perception systems. Strong programming skills in Python and C++. Experience with large datasets and data processing frameworks. Familiarity with deep learning frameworks such as TensorFlow or PyTorch. Excellent analytical and problem-solving skills. Master's or Ph.D. in Computer Science, Engineering, or a related field.
About the job
Join Waymo as a Senior Machine Learning Engineer focusing on Perception LLM/VLM. In this role, you will leverage cutting-edge machine learning techniques to enhance our autonomous driving technology. You will collaborate with a talented team of engineers and researchers to develop algorithms that improve our perception systems, ensuring safety and efficiency on the road.
About Waymo
Waymo is a leader in self-driving technology, dedicated to making roads safer and more accessible for everyone. With a commitment to innovation, we aim to redefine transportation through our state-of-the-art autonomous vehicles.
Similar jobs
Machine Learning Engineer, LLM Evaluations and Observability
Join gleanwork as a Machine Learning Engineer specializing in LLM evaluations and observability. In this role, you will be instrumental in developing machine learning systems that improve our understanding of large language models and their effectiveness. You will collaborate with cross-functional teams to drive the integration of advanced analytics and machine learning solutions.
Full-time|$280K/yr - $380K/yr|On-site|San Francisco, CA; Seattle, WA; New York, NY
At Scale AI, we are the premier partner for data and evaluation in the rapidly evolving field of artificial intelligence. Our commitment to advancing the assessment and benchmarking of large language models (LLMs) positions us at the forefront of AI innovation. We are dedicated to creating leading-edge LLM evaluation methodologies that set new benchmarks for model performance. Our research teams collaborate with the top AI laboratories in the industry to provide high-quality data, accelerate progress in generative AI research, and inform what excellence looks like in this domain. As a Staff Machine Learning Research Scientist on our LLM Evals team, you will spearhead the creation of novel evaluation methodologies, metrics, and benchmarks to assess the strengths and weaknesses of cutting-edge LLMs. Your work will shape our internal strategies and influence the broader AI research community, making this role essential for establishing best practices in data-driven AI development.
Full-time|$280K/yr - $380K/yr|On-site|San Francisco, CA; Seattle, WA; New York, NY
As a premier data and evaluation partner for cutting-edge AI firms, Scale AI is committed to enhancing the evaluation and benchmarking of large language models (LLMs). We are developing industry-leading LLM evaluations that set new benchmarks for model performance assessment. Our mission is to create rigorous, scalable, and equitable evaluation methodologies that propel the next evolution of AI capabilities.
Our Research teams collaborate with top AI laboratories to provide high-quality data and expedite advancements in Generative AI research. As the Tech Lead/Manager of the LLM Evaluations Research team, you will guide a skilled team of research scientists and engineers dedicated to crafting and applying innovative evaluation methodologies, metrics, and benchmarks that assess the strengths and weaknesses of our advanced LLMs. This pivotal role involves designing and executing a strategic roadmap that establishes best practices in data-driven AI development, thus accelerating the development of the next generation of generative AI models in collaboration with leading foundational model labs.
Bland Inc. seeks a Machine Learning Researcher specializing in Multimodal Large Language Models (LLMs) to join the team in San Francisco. The focus is on advancing AI systems that integrate language with other types of data.
Role overview
This position centers on research and development aimed at improving how AI models process and understand information from multiple sources, such as text combined with images or other modalities.
What you will do
- Investigate how language interacts with additional data types within multimodal LLMs
- Create and evaluate new methods to enhance AI model performance
- Work closely with colleagues on projects designed to push the boundaries of machine learning
Location
This role is based in San Francisco.
At Exa, we are pioneering the next generation of search engines designed for the era of artificial intelligence, starting from the foundational silicon architecture. Our ambitious indexing operation is unparalleled, allowing us to crawl the vast open web at an extraordinary scale. We harness cutting-edge embedding models to comprehend this data and utilize our high-performance Rust-based vector database alongside a $5M H200 GPU cluster, which powers tens of thousands of machines simultaneously.
The Machine Learning (ML) division is central to this mission, focusing on the training of foundational models that enhance search capabilities. Our vision is to create systems capable of swiftly filtering the world’s knowledge to deliver precisely what you need, regardless of the complexity of your inquiry, effectively transforming the web into a robust, searchable database.
To achieve this ambitious goal, we must define what constitutes “effective search”. This is where your expertise will play a crucial role.
We are seeking a talented Machine Learning Evaluations Engineer to develop and implement our evaluation framework at Exa. This position entails exploring methodologies to assess search engines in a world dominated by large language models (LLMs) and crafting the most thorough, innovative, and impactful evaluation suite. Your decisions will influence the future of search optimization and directly affect the research team’s focus, shaping the company’s strategic direction.
Full-time|$200K/yr - $240K/yr|On-site|San Francisco, CA
Join Us in Building a Safer World.
At TRM Labs, we specialize in blockchain analytics and AI solutions aimed at assisting law enforcement, national security agencies, financial institutions, and cryptocurrency businesses in identifying, investigating, and preventing crypto-related fraud and financial crime. Our innovative platforms leverage blockchain intelligence and AI technology to trace funds, detect illicit activity, and construct comprehensive threat profiles. Trusted by leading organizations worldwide, TRM Labs is committed to enabling a safer and more secure environment for all.
Our AI Engineering Team is dedicated to pioneering next-generation AI applications, particularly in the realm of Large Language Models (LLMs) and agentic systems. Our goal is to develop resilient pipelines and high-performance infrastructure that facilitate the swift, safe, and scalable deployment of AI systems.
We manage extensive petabyte-scale pipelines, ensuring model serving with millisecond latency while providing the necessary observability and governance to make AI production-ready. Our team actively evaluates and integrates leading-edge tools in the LLM and agent space, including open-source stacks, vector databases, evaluation frameworks, and orchestration tools to accelerate TRM’s innovation pace.
As a Senior or Staff ML Systems Engineer – LLM, you will play a pivotal role in constructing and scaling our technical infrastructure for AI/ML systems.
Your responsibilities will include:
- Creating reusable CI/CD workflows for model training, evaluation, and deployment, integrating tools such as Langfuse, GitHub Actions, and experiment tracking.
- Automating model versioning, approval processes, and compliance checks across various environments.
- Developing a modular and scalable AI infrastructure stack that encompasses vector databases, feature stores, model registries, and observability tools.
- Collaborating with engineering and data science teams to embed AI models and agents into real-time applications and workflows.
- Continuously assessing and incorporating state-of-the-art AI tools (e.g., LangChain, LlamaIndex, vLLM, MLflow, BentoML).
- Promoting AI reliability and governance while enabling experimentation, ensuring compliance, security, and continuous uptime.
- Enhancing AI/ML model performance and ensuring data accuracy and consistency, leading to improved model training and inference.
- Implementing infrastructure to facilitate both offline and online evaluation of LLMs and agents.
Join Reducto as a Machine Learning Evaluation Engineer where you will play a critical role in assessing and enhancing machine learning models. You will collaborate closely with data scientists and engineers to ensure our systems are efficient and accurate, bringing innovative solutions to challenging problems in the machine learning space.
Full-time|$216.3K/yr - $300.3K/yr|On-site|San Francisco, CA; St. Louis, MO; New York, NY; Washington, DC
Senior Machine Learning Engineer - Model Evaluations for the Public Sector
The Public Sector Machine Learning team at Scale AI pioneers the deployment of cutting-edge AI systems, including Large Language Models (LLMs), agentic models, and comprehensive multimodal pipelines, within critical government operations. We establish robust evaluation frameworks that ensure these models function reliably, safely, and effectively in real-world scenarios. As a Senior Machine Learning Engineer, you will architect, implement, and enhance automated evaluation pipelines that empower our clients to trust and effectively utilize advanced AI systems in defense, intelligence, and federal missions.
Your Responsibilities Include:
- Creating and maintaining automated evaluation pipelines for machine learning models, focusing on functional, performance, robustness, and safety metrics, including evaluations based on LLM judges.
- Designing test datasets and benchmarks to assess generalization, bias, explainability, and potential failure modes.
- Building evaluation frameworks for LLM agents, which includes the infrastructure for scenario-based and environment-based testing.
- Conducting comparative analyses of model architectures, training procedures, and evaluation results.
- Implementing tools for continuous monitoring, regression testing, and quality assurance of machine learning systems.
- Designing and executing stress tests and red-teaming workflows to identify vulnerabilities and edge cases.
- Collaborating with operations teams and subject matter experts to generate high-quality evaluation datasets.
This position requires an active security clearance or the ability to obtain one.
Join Whatnot as an LLM Platform Engineer where you'll be at the forefront of developing and optimizing cutting-edge language models. In this role, you will collaborate with a dynamic team of engineers and data scientists to enhance our machine learning infrastructure and algorithms. Your contributions will directly impact the efficiency and effectiveness of our language understanding capabilities.
Full-time|$275K/yr - $350K/yr|On-site|San Francisco, CA; Seattle, WA; New York, NY
About Scale AI
At Scale AI, we are dedicated to propelling the advancement of AI applications. Over the past eight years, we have established ourselves as the premier AI data foundry, supporting groundbreaking innovations in fields such as generative AI, defense technologies, and autonomous vehicles. Following our recent Series F funding round, we are intensifying our efforts to harness frontier data, paving the way toward achieving Artificial General Intelligence (AGI). Our work with enterprise clients and governments has enhanced our model evaluation capabilities, allowing us to expand our offerings for both public and private evaluations.
About the ACE Team
The Agent Capabilities & Environments (ACE) team, a vital part of Scale’s Research organization, unites customer-focused Researchers and Applied AI Engineers. Our primary mission is to conduct research on agent environments and reinforcement learning reward signals, benchmark autonomous agent performance in real-world contexts, and develop robust data programs aimed at enhancing the capabilities of Large Language Models (LLMs). We are committed to creating foundational tools and frameworks for evaluating models as agents, focusing on autonomous agents that interact dynamically with a wide range of external environments, including code repositories and GUI interfaces.
About This Role
This position sits at the cutting edge of AI research and its practical applications, concentrating on the data types necessary for the development of state-of-the-art agents, including browser and software engineering agents. The ideal candidate will investigate the data landscape required to propel intelligent and adaptable AI agents, steering the data strategy at Scale to foster innovation. This role demands not only expertise in LLM agents and planning algorithms but also creative problem-solving skills to tackle novel challenges pertaining to data, interaction, and evaluation. You will contribute to influential research publications on agents, collaborate with customer researchers, and partner with the engineering team to transform these advancements into scalable real-world solutions.
Join Orchard as a Machine Learning Engineer and play a pivotal role in transforming data into actionable insights. In this dynamic position, you will leverage your expertise in machine learning algorithms and data analysis to develop innovative solutions that enhance our products and services.
We are looking for a proactive team player who thrives in a fast-paced environment and possesses strong problem-solving skills. You will collaborate with cross-functional teams, engage with large datasets, and contribute to the design and implementation of machine learning models.
Reflection AI develops open-weight models with the goal of making superintelligence broadly accessible. The team draws on backgrounds from DeepMind, OpenAI, Google Brain, Meta, and Anthropic, and serves a wide range of users including individuals, enterprises, and government organizations.
Role overview
This Machine Learning Engineer position focuses on post-training and evaluation within the Applied AI group in San Francisco. The main responsibility is to fine-tune and evaluate Reflection AI’s open-weight models for enterprise customers, adapting them to specific domains and tasks using real customer data. The work covers the entire process: preparing and cleaning datasets, running fine-tuning workflows, building evaluation systems, and deploying models into production. Collaboration is central, both with clients to understand their needs and with research colleagues to advance model capabilities.
What you will do
- Fine-tune open-weight models for customer use cases, including dataset preparation, configuring training (such as SFT, preference optimization, and reinforcement fine-tuning), and iterating based on evaluation feedback.
- Design and maintain evaluation infrastructure: create evaluation suites, curate test sets, set baselines, and measure improvements on key customer tasks.
- Prepare training data from raw customer sources by assessing data quality, cleaning and formatting, identifying noisy or adversarial samples, and building reproducible data pipelines.
- Troubleshoot training and inference by analyzing loss curves, diagnosing data issues, and identifying problematic training dynamics.
- Deploy fine-tuned models in hybrid environments (public cloud, VPC, on-premises) to ensure reliable, high-performance inference in production.
- Contribute to developing playbooks, evaluation benchmarks, and best practices for fine-tuning and evaluation as the team’s approach evolves.
Requirements
- Hands-on experience in applied machine learning, especially fine-tuning language models. This includes preparing datasets, running training loops, evaluating results, and deploying models. Familiarity with SFT, DPO, RLHF, or related techniques is required.
- Strong understanding of evaluation methods, with the ability to design evaluations, interpret training metrics, and accurately assess model performance.
Location
San Francisco
Join Handshake as a Machine Learning Engineer I, where you will have the opportunity to work on cutting-edge machine learning projects that drive our innovative solutions. Collaborate with a talented team to develop algorithms and models that enhance our product offerings and improve user experiences.
Join Our Innovative Team at Hive
Hive is at the forefront of cloud-based AI solutions, revolutionizing how organizations understand, search for, and generate content. Trusted by many of the world's largest and most groundbreaking companies, we empower developers with premier pre-trained AI models that handle billions of API requests monthly. Our turnkey software applications leverage proprietary AI models and datasets, driving transformative advancements in content moderation, brand protection, sponsorship measurement, and context-based ad targeting.
With over $120M in funding from prominent investors like General Catalyst, 8VC, Glynn Capital, Bain & Company, and Visa Ventures, Hive is rapidly expanding. Our dynamic team of over 250 employees operates from our San Francisco, Seattle, and Delhi offices. If you are passionate about shaping the future of AI, we invite you to explore opportunities with us!
About the Machine Learning Engineer Role
As we strive to achieve our ambitious vision, we seek exceptional machine learning engineers to join our team. We are looking for enthusiastic developers who are eager to remain at the cutting edge of deep learning technology, designing and deploying state-of-the-art neural network models into production. Our ideal candidates thrive in working with large-scale datasets and demonstrate a keen interest in mastering new technologies across the machine learning spectrum. We value individuals who are proactive and take ownership of their projects, contributing innovative ideas and practical implementations. Experience in building machine learning applications from the ground up and designing scalable, maintainable data pipelines is essential.
Join our dynamic Personalization team at Boomtrain as a Machine Learning Engineer. We are in search of a skilled engineer who will play a pivotal role in developing and enhancing our recommendation systems that cater to a variety of customers.
In this role, you will collaborate with a talented team dedicated to designing and implementing innovative models and systems that deliver personalized recommendations. You will have the opportunity to work on complex engineering challenges and contribute to generating hundreds of millions of recommendations daily.
This position offers a unique chance to engage in end-to-end project work and make a significant impact on our personalization initiatives.
Key Responsibilities:
- Research and propose advanced recommendation and optimization models to enhance our personalization systems.
- Develop and maintain offline model generation pipelines.
- Design and maintain online recommendation serving systems.
Full-time|$252K/yr - $315K/yr|On-site|San Francisco, CA; New York, NY
Join Scale's innovative Large Language Model (LLM) post-training platform team, where you will contribute to the development of our internal distributed framework designed specifically for LLM training. This sophisticated platform empowers Machine Learning Engineers (MLEs), researchers, data scientists, and operators to perform rapid and automated training and evaluation of LLMs. Additionally, it underpins the training framework for our data quality evaluation pipeline.
Scale is at the forefront of the Artificial Intelligence sector, acting as a vital provider of training and evaluation data, as well as comprehensive solutions for the entire machine learning lifecycle. In this role, you will collaborate closely with Scale’s ML teams and researchers to construct the foundational platform that supports all our ML research and development initiatives. Your work will involve building and optimizing this platform to facilitate the training, inference, and data curation of next-generation LLMs.
If you are passionate about driving the future of AI through groundbreaking innovations, we invite you to connect with us!
Full-time|$252K/yr - $315K/yr|On-site|San Francisco, CA; Seattle, WA; New York, NY
At Scale AI, we collaborate with leading AI laboratories to supply high-quality data and foster advancements in Generative AI research. We seek innovative Research Scientists and Research Engineers with a strong focus on post-training techniques for Large Language Models (LLMs), including Supervised Fine-Tuning (SFT), Reinforcement Learning from Human Feedback (RLHF), and reward modeling. This position emphasizes optimizing data curation and evaluation processes to boost LLM performance across text and multimodal formats. In this pivotal role, you will pioneer new methods to enhance the alignment and generalization of extensive generative models. You will work closely with fellow researchers and engineers to establish best practices in data-driven AI development. Additionally, you will collaborate with top foundation model labs, providing critical technical and strategic insights for the evolution of next-generation generative AI models.
Overview
Pulse is revolutionizing data infrastructure by addressing the critical challenge of extracting accurate, structured information from complex documents on a large scale. Our innovative approach to document understanding integrates intelligent schema mapping with advanced extraction models, outperforming traditional OCR and parsing methods.
As a dynamic and rapidly growing team of engineers based in San Francisco, we empower Fortune 100 companies, Y Combinator startups, public investment firms, and growth-oriented businesses. With the backing of top-tier investors, we are on an exciting growth trajectory.
What sets our technology apart is our cutting-edge multi-stage architecture:
- Layout comprehension with specialized component detection models
- Low-latency OCR models designed for targeted data extraction
- Advanced algorithms for determining reading order in complex formats
- Proprietary table structure recognition and parsing capabilities
- Fine-tuned vision-language models for interpreting charts, tables, and figures
If you are passionate about the convergence of computer vision, natural language processing, and data infrastructure, your contributions at Pulse will directly influence our customers and shape the future of document intelligence.
Join Handshake as an Associate Machine Learning Engineer and embark on an exciting journey in the world of artificial intelligence and machine learning. In this role, you will collaborate with a talented team to develop innovative solutions that leverage cutting-edge technologies. You'll have the opportunity to contribute to real-world projects, enhancing your skills while driving impactful results.
Apr 2, 2026