About the Role
At Generalist, we train large robot foundation models on cutting-edge GPU hardware, primarily from Nvidia, for distributed training runs and experimental research. These workloads demand high-performance storage and optimized data loading, making full use of cloud infrastructure alongside custom-built solutions.
In this role, you will own our inference infrastructure: a dedicated fleet of on-premises GPUs that serves our robots' real-time, latency-sensitive workloads in resource-constrained environments.
Your Responsibilities:
Manage and optimize our GPU compute fleets.
Give researchers easy, self-serve access to GPUs while keeping utilization high.
Improve ML data loading, transport, and storage systems in heavily used distributed environments.
Oversee the orchestration of our robot inference fleets.
You May Excel in This Position If You:
Have experience managing large GPU fleets for distributed training or inference.
Have deep expertise with Slurm or Kubernetes for orchestrating ML workloads.
Have built high-throughput ML data loaders and data preparation systems.
Understand the intricacies of ML hardware, storage, and networking systems.
Are familiar with the Nvidia GPU ecosystem.

