Senior Machine Learning Engineer - Perception LLM/VLM

Waymo|Mountain View, CA USA; San Francisco, CA USA

Hybrid Full-time

Experience Level

Senior

Qualifications

Proven experience in machine learning, particularly in perception systems. Strong programming skills in Python and C++. Experience with large datasets and data processing frameworks. Familiarity with deep learning frameworks such as TensorFlow or PyTorch. Excellent analytical and problem-solving skills. Master's or Ph.D. in Computer Science, Engineering, or a related field.

About the job

Join Waymo as a Senior Machine Learning Engineer focusing on Perception LLM/VLM. In this role, you will leverage cutting-edge machine learning techniques to enhance our autonomous driving technology. You will collaborate with a talented team of engineers and researchers to develop algorithms that improve our perception systems, ensuring safety and efficiency on the road.

About Waymo

Waymo is a leader in self-driving technology, dedicated to making roads safer and more accessible for everyone. With a commitment to innovation, we aim to redefine transportation through our state-of-the-art autonomous vehicles.

Similar jobs

Machine Learning Engineer - LLM Evaluations and Observability

gleanwork

Full-time|Remote|San Francisco Bay Area

Join gleanwork as a Machine Learning Engineer specializing in LLM evaluations and observability. In this role, you will be instrumental in developing cutting-edge machine learning systems that improve our understanding of large language models and their effectiveness. You will collaborate with cross-functional teams to drive the integration of advanced analytics and machine learning solutions.

Mar 16, 2026

Staff Machine Learning Research Scientist - LLM Evaluations

Scale AI

Full-time|$280K/yr - $380K/yr|On-site|San Francisco, CA; Seattle, WA; New York, NY

At Scale AI, we are the premier partner for data and evaluation in the rapidly evolving field of artificial intelligence. Our commitment to advancing the assessment and benchmarking of large language models (LLMs) positions us at the forefront of AI innovation. We are dedicated to creating leading-edge LLM evaluation methodologies that set new benchmarks for model performance. Our research teams collaborate with the top AI laboratories in the industry to provide high-quality data, accelerate progress in generative AI research, and inform what excellence looks like in this domain. As a Staff Machine Learning Research Scientist on our LLM Evals team, you will spearhead the creation of novel evaluation methodologies, metrics, and benchmarks to assess the strengths and weaknesses of cutting-edge LLMs. Your work will shape our internal strategies and influence the broader AI research community, making this role essential for establishing best practices in data-driven AI development.

Mar 26, 2026

Tech Lead/Manager, Machine Learning Research Scientist - LLM Evaluations

Scale AI, Inc.

Full-time|$280K/yr - $380K/yr|On-site|San Francisco, CA; Seattle, WA; New York, NY

As a premier data and evaluation partner for cutting-edge AI firms, Scale AI is committed to enhancing the evaluation and benchmarking of large language models (LLMs). We are developing industry-leading LLM evaluations that set new benchmarks for model performance assessment. Our mission is to create rigorous, scalable, and equitable evaluation methodologies that propel the next evolution of AI capabilities.

Our Research teams collaborate with top AI laboratories to provide high-quality data and expedite advancements in Generative AI research. As the Tech Lead/Manager of the LLM Evaluations Research team, you will guide a skilled team of research scientists and engineers dedicated to crafting and applying innovative evaluation methodologies, metrics, and benchmarks that assess the strengths and weaknesses of our advanced LLMs. This pivotal role involves designing and executing a strategic roadmap that establishes best practices in data-driven AI development, thus accelerating the development of the next generation of generative AI models in collaboration with leading foundational model labs.

Mar 26, 2026

Machine Learning Researcher - Multimodal LLMs

Bland Inc.

Full-time|On-site|San Francisco

Bland Inc. seeks a Machine Learning Researcher specializing in Multimodal Large Language Models (LLMs) to join the team in San Francisco. The focus is on advancing AI systems that integrate language with other types of data.

Role overview

This position centers on research and development aimed at improving how AI models process and understand information from multiple sources, such as text combined with images or other modalities.

What you will do

Investigate how language interacts with additional data types within multimodal LLMs.
Create and evaluate new methods to enhance AI model performance.
Work closely with colleagues on projects designed to push the boundaries of machine learning.

Location

This role is based in San Francisco.

Apr 21, 2026

Machine Learning Evaluations Engineer

Exa

Full-time|On-site|San Francisco, California

At Exa, we are pioneering the next generation of search engines designed for the era of artificial intelligence, starting from the foundational Silicon architecture. Our ambitious indexing operation is unparalleled, allowing us to crawl the vast open web at an extraordinary scale. We harness cutting-edge embedding models to comprehend this data and utilize our high-performance Rust-based vector database alongside a $5M H200 GPU cluster, which powers tens of thousands of machines simultaneously.

The Machine Learning (ML) division is central to this mission, focusing on the training of foundational models that enhance search capabilities. Our vision is to create systems capable of swiftly filtering the world’s knowledge to deliver precisely what you need, regardless of the complexity of your inquiry, effectively transforming the web into a robust, searchable database.

To achieve this ambitious goal, we must define what constitutes “effective search”. This is where your expertise will play a crucial role.

We are seeking a talented Machine Learning Evaluations Engineer to develop and implement our evaluation framework at Exa. This position entails exploring methodologies to assess search engines in a world dominated by large language models (LLMs) and crafting the most thorough, innovative, and impactful evaluation suite. Your decisions will influence the future of search optimization and directly affect the research team’s focus, shaping the company’s strategic direction.

Oct 15, 2025

Senior or Staff Machine Learning Systems Engineer – LLMs

TRM Labs

Full-time|$200K/yr - $240K/yr|On-site|San Francisco, CA

Join Us in Building a Safer World.

At TRM Labs, we specialize in blockchain analytics and AI solutions aimed at assisting law enforcement, national security agencies, financial institutions, and cryptocurrency businesses in identifying, investigating, and preventing crypto-related fraud and financial crime. Our innovative platforms leverage blockchain intelligence and AI technology to trace funds, detect illicit activity, and construct comprehensive threat profiles. Trusted by leading organizations worldwide, TRM Labs is committed to enabling a safer and more secure environment for all.

Our AI Engineering Team is dedicated to pioneering next-generation AI applications, particularly in the realm of Large Language Models (LLMs) and agentic systems. Our goal is to develop resilient pipelines and high-performance infrastructure that facilitate the swift, safe, and scalable deployment of AI systems.

We manage extensive petabyte-scale pipelines, ensuring model serving with millisecond latency while providing the necessary observability and governance to make AI production-ready. Our team actively evaluates and integrates leading-edge tools in the LLM and agent space, including open-source stacks, vector databases, evaluation frameworks, and orchestration tools to accelerate TRM’s innovation pace.

As a Senior or Staff ML Systems Engineer – LLM, you will play a pivotal role in constructing and scaling our technical infrastructure for AI/ML systems.

Your responsibilities will include:

Creating reusable CI/CD workflows for model training, evaluation, and deployment, integrating tools such as Langfuse, GitHub Actions, and experiment tracking.
Automating model versioning, approval processes, and compliance checks across various environments.
Developing a modular and scalable AI infrastructure stack that encompasses vector databases, feature stores, model registries, and observability tools.
Collaborating with engineering and data science teams to embed AI models and agents into real-time applications and workflows.
Continuously assessing and incorporating state-of-the-art AI tools (e.g., LangChain, LlamaIndex, vLLM, MLflow, BentoML).
Promoting AI reliability and governance while enabling experimentation, ensuring compliance, security, and continuous uptime.
Enhancing AI/ML model performance and ensuring data accuracy and consistency, leading to improved model training and inference.
Implementing infrastructure to facilitate both offline and online evaluation of LLMs and agents.

Mar 12, 2026

Senior Machine Learning Engineer - Perception LLM/VLM

Waymo

Full-time|Hybrid|Mountain View, CA USA; San Francisco, CA USA;

Mar 12, 2026

Machine Learning Evaluation Engineer

Reducto

Full-time|On-site|San Francisco Office

Join Reducto as a Machine Learning Evaluation Engineer where you will play a critical role in assessing and enhancing machine learning models. You will collaborate closely with data scientists and engineers to ensure our systems are efficient and accurate, bringing innovative solutions to challenging problems in the machine learning space.

Mar 16, 2026

Senior Machine Learning Engineer - Model Evaluations for Public Sector

Scale AI

Full-time|$216.3K/yr - $300.3K/yr|On-site|San Francisco, CA; St. Louis, MO; New York, NY; Washington, DC

The Public Sector Machine Learning team at Scale AI pioneers the deployment of cutting-edge AI systems, including Large Language Models (LLMs), agentic models, and comprehensive multimodal pipelines, within critical government operations. We establish robust evaluation frameworks that ensure these models function reliably, safely, and effectively in real-world scenarios. As a Senior Machine Learning Engineer, you will architect, implement, and enhance automated evaluation pipelines that empower our clients to trust and effectively utilize advanced AI systems in defense, intelligence, and federal missions.

Your Responsibilities Include:

Creating and maintaining automated evaluation pipelines for machine learning models, focusing on functional, performance, robustness, and safety metrics, including evaluations based on LLM judges.
Designing test datasets and benchmarks to assess generalization, bias, explainability, and potential failure modes.
Building evaluation frameworks for LLM agents, which includes the infrastructure for scenario-based and environment-based testing.
Conducting comparative analyses of model architectures, training procedures, and evaluation results.
Implementing tools for continuous monitoring, regression testing, and quality assurance of machine learning systems.
Designing and executing stress tests and red-teaming workflows to identify vulnerabilities and edge cases.
Collaborating with operations teams and subject matter experts to generate high-quality evaluation datasets.

This position requires an active security clearance or the ability to obtain one.

Mar 26, 2026

LLM Platform Engineer

Whatnot

Full-time|On-site|San Francisco, CA

Join Whatnot as an LLM Platform Engineer where you'll be at the forefront of developing and optimizing cutting-edge language models. In this role, you will collaborate with a dynamic team of engineers and data scientists to enhance our machine learning infrastructure and algorithms. Your contributions will directly impact the efficiency and effectiveness of our language understanding capabilities.

Mar 3, 2026

Staff Machine Learning Research Scientist/Engineer, Agents

Scale AI

Full-time|$275K/yr - $350K/yr|On-site|San Francisco, CA; Seattle, WA; New York, NY

About Scale AI

At Scale AI, we are dedicated to propelling the advancement of AI applications. Over the past eight years, we have established ourselves as the premier AI data foundry, supporting groundbreaking innovations in fields such as generative AI, defense technologies, and autonomous vehicles. Following our recent Series F funding round, we are intensifying our efforts to harness frontier data, paving the way toward achieving Artificial General Intelligence (AGI). Our work with enterprise clients and governments has enhanced our model evaluation capabilities, allowing us to expand our offerings for both public and private evaluations.

About the ACE Team

The Agent Capabilities & Environments (ACE) team, a vital part of Scale’s Research organization, unites customer-focused Researchers and Applied AI Engineers. Our primary mission is to conduct research on agent environments and reinforcement learning reward signals, benchmark autonomous agent performance in real-world contexts, and develop robust data programs aimed at enhancing the capabilities of Large Language Models (LLMs). We are committed to creating foundational tools and frameworks for evaluating models as agents, focusing on autonomous agents that interact dynamically with a wide range of external environments, including code repositories and GUI interfaces.

About This Role

This position sits at the cutting edge of AI research and its practical applications, concentrating on the data types necessary for the development of state-of-the-art agents, including browser and software engineering agents. The ideal candidate will investigate the data landscape required to propel intelligent and adaptable AI agents, steering the data strategy at Scale to foster innovation. This role demands not only expertise in LLM agents and planning algorithms but also creative problem-solving skills to tackle novel challenges pertaining to data, interaction, and evaluation. You will contribute to influential research publications on agents, collaborate with customer researchers, and partner with the engineering team to transform these advancements into scalable real-world solutions.

Mar 26, 2026

Machine Learning Engineer

Orchard

Full-time|On-site|San Francisco

Join Orchard as a Machine Learning Engineer and play a pivotal role in transforming data into actionable insights. In this dynamic position, you will leverage your expertise in machine learning algorithms and data analysis to develop innovative solutions that enhance our products and services.

We are looking for a proactive team player who thrives in a fast-paced environment and possesses strong problem-solving skills. You will collaborate with cross-functional teams, engage with large datasets, and contribute to the design and implementation of machine learning models.

Mar 14, 2026

Machine Learning Engineer - Post-Training and Evaluation

Reflection AI

Full-time|On-site|San Francisco

Reflection AI develops open-weight models with the goal of making superintelligence broadly accessible. The team draws on backgrounds from DeepMind, OpenAI, Google Brain, Meta, and Anthropic, and serves a wide range of users including individuals, enterprises, and government organizations.

Role overview

This Machine Learning Engineer position focuses on post-training and evaluation within the Applied AI group in San Francisco. The main responsibility is to fine-tune and evaluate Reflection AI’s open-weight models for enterprise customers, adapting them to specific domains and tasks using real customer data. The work covers the entire process: preparing and cleaning datasets, running fine-tuning workflows, building evaluation systems, and deploying models into production. Collaboration is central, both with clients to understand their needs and with research colleagues to advance model capabilities.

What you will do

Fine-tune open-weight models for customer use cases, including dataset preparation, configuring training (such as SFT, preference optimization, and reinforcement fine-tuning), and iterating based on evaluation feedback.
Design and maintain evaluation infrastructure: create evaluation suites, curate test sets, set baselines, and measure improvements on key customer tasks.
Prepare training data from raw customer sources by assessing data quality, cleaning and formatting, identifying noisy or adversarial samples, and building reproducible data pipelines.
Troubleshoot training and inference by analyzing loss curves, diagnosing data issues, and identifying problematic training dynamics.
Deploy fine-tuned models in hybrid environments (public cloud, VPC, on-premises) to ensure reliable, high-performance inference in production.
Contribute to developing playbooks, evaluation benchmarks, and best practices for fine-tuning and evaluation as the team’s approach evolves.

Requirements

Hands-on experience in applied machine learning, especially fine-tuning language models. This includes preparing datasets, running training loops, evaluating results, and deploying models. Familiarity with SFT, DPO, RLHF, or related techniques is required.
Strong understanding of evaluation methods, with the ability to design evaluations, interpret training metrics, and accurately assess model performance.

Location

San Francisco

Apr 22, 2026

Machine Learning Engineer I

Handshake

Full-time|On-site|San Francisco, CA

Join Handshake as a Machine Learning Engineer I, where you will have the opportunity to work on cutting-edge machine learning projects that drive our innovative solutions. Collaborate with a talented team to develop algorithms and models that enhance our product offerings and improve user experiences.

Apr 6, 2026

Machine Learning Engineer

Hive

Full-time|On-site|San Francisco

Join Our Innovative Team at Hive

Hive is at the forefront of cloud-based AI solutions, revolutionizing how organizations understand, search for, and generate content. Trusted by many of the world's largest and most groundbreaking companies, we empower developers with premier pre-trained AI models that handle billions of API requests monthly. Our turnkey software applications leverage proprietary AI models and datasets, driving transformative advancements in content moderation, brand protection, sponsorship measurement, and context-based ad targeting.

With over $120M in funding from prominent investors like General Catalyst, 8VC, Glynn Capital, Bain & Company, and Visa Ventures, Hive is rapidly expanding. Our dynamic team of over 250 employees operates from our San Francisco, Seattle, and Delhi offices. If you are passionate about shaping the future of AI, we invite you to explore opportunities with us!

About the Machine Learning Engineer Role

As we strive to achieve our ambitious vision, we seek exceptional machine learning engineers to join our team. We are looking for enthusiastic developers who are eager to remain at the cutting edge of deep learning technology, designing and deploying state-of-the-art neural network models into production. Our ideal candidates thrive in working with large-scale datasets and demonstrate a keen interest in mastering new technologies across the machine learning spectrum. We value individuals who are proactive and take ownership of their projects, contributing innovative ideas and practical implementations. Experience in building machine learning applications from the ground up and designing scalable, maintainable data pipelines is essential.

Jan 15, 2021

Machine Learning Engineer

Boomtrain

Full-time|On-site|San Francisco

Join our dynamic Personalization team at Boomtrain as a Machine Learning Engineer. We are in search of a skilled engineer who will play a pivotal role in developing and enhancing our recommendation systems that cater to a variety of customers.

In this role, you will collaborate with a talented team dedicated to designing and implementing innovative models and systems that deliver personalized recommendations. You will have the opportunity to work on complex engineering challenges and contribute to generating hundreds of millions of recommendations daily.

This position offers a unique chance to engage in end-to-end project work and make a significant impact on our personalization initiatives.

Key Responsibilities:

Research and propose advanced recommendation and optimization models to enhance our personalization systems.
Develop and maintain offline model generation pipelines.
Design and maintain online recommendation serving systems.

Jul 21, 2016

Tech Lead Manager - Machine Learning Research and Engineering

Scale AI

Full-time|$252K/yr - $315K/yr|On-site|San Francisco, CA; New York, NY

Join Scale's innovative Large Language Model (LLM) post-training platform team, where you will contribute to the development of our internal distributed framework designed specifically for LLM training. This sophisticated platform empowers Machine Learning Engineers (MLEs), researchers, data scientists, and operators to perform rapid and automated training and evaluation of LLMs. Additionally, it underpins the training framework for our data quality evaluation pipeline.

Scale is at the forefront of the Artificial Intelligence sector, acting as a vital provider of training and evaluation data, as well as comprehensive solutions for the entire machine learning lifecycle. In this role, you will collaborate closely with Scale’s ML teams and researchers to construct the foundational platform that supports all our ML research and development initiatives. Your work will involve building and optimizing this platform to facilitate the training, inference, and data curation of next-generation LLMs.

If you are passionate about driving the future of AI through groundbreaking innovations, we invite you to connect with us!

Mar 26, 2026

Machine Learning Research Scientist / Research Engineer - Post-Training

Scale AI

Full-time|$252K/yr - $315K/yr|On-site|San Francisco, CA; Seattle, WA; New York, NY

At Scale AI, we collaborate with leading AI laboratories to supply high-quality data and foster advancements in Generative AI research. We seek innovative Research Scientists and Research Engineers with a strong focus on post-training techniques for Large Language Models (LLMs), including Supervised Fine-Tuning (SFT), Reinforcement Learning from Human Feedback (RLHF), and reward modeling. This position emphasizes optimizing data curation and evaluation processes to boost LLM performance across text and multimodal formats. In this pivotal role, you will pioneer new methods to enhance the alignment and generalization of extensive generative models. You will work closely with fellow researchers and engineers to establish best practices in data-driven AI development. Additionally, you will collaborate with top foundation model labs, providing critical technical and strategic insights for the evolution of next-generation generative AI models.

Mar 26, 2026

Machine Learning Engineer

Pulse

Full-time|On-site|San Francisco

Overview

Pulse is revolutionizing data infrastructure by addressing the critical challenge of extracting accurate, structured information from complex documents on a large scale. Our innovative approach to document understanding integrates intelligent schema mapping with advanced extraction models, outperforming traditional OCR and parsing methods.

As a dynamic and rapidly growing team of engineers based in San Francisco, we empower Fortune 100 companies, Y Combinator startups, public investment firms, and growth-oriented businesses. With the backing of top-tier investors, we are on an exciting growth trajectory.

What sets our technology apart is our cutting-edge multi-stage architecture:

Layout comprehension with specialized component detection models
Low-latency OCR models designed for targeted data extraction
Advanced algorithms for determining reading order in complex formats
Proprietary table structure recognition and parsing capabilities
Fine-tuned vision-language models for interpreting charts, tables, and figures

If you are passionate about the convergence of computer vision, natural language processing, and data infrastructure, your contributions at Pulse will directly influence our customers and shape the future of document intelligence.

Jul 30, 2025

Associate Machine Learning Engineer

Handshake

Full-time|On-site|San Francisco, CA

Join Handshake as an Associate Machine Learning Engineer and embark on an exciting journey in the world of artificial intelligence and machine learning. In this role, you will collaborate with a talented team to develop innovative solutions that leverage cutting-edge technologies. You'll have the opportunity to contribute to real-world projects, enhancing your skills while driving impactful results.

Apr 2, 2026
