About the job
Join Our Mission at Nuro
Nuro develops self-driving technology with a vision to make autonomy accessible to everyone. Founded in 2016, we are building the world’s most scalable driving solution by combining advanced AI with resilient automotive-grade hardware. Our flagship technology, the Nuro Driver™, is licensed for a variety of applications, from robotaxis to commercial fleets and private vehicles. With years of successful deployments, Nuro offers automakers and mobility platforms a streamlined path to commercial-scale autonomous vehicles and a safer, more integrated future.
Role Overview
Nuro is seeking an experienced Technical Lead Manager to join our ML Infrastructure team. The ideal candidate has strong experience in large-scale infrastructure and workload orchestration, along with proficiency in batch and streaming data processing systems. In this pivotal role, you will lead the development of our core platform, ensuring that our researchers and engineers have uninterrupted access to the compute and data resources vital for advancing autonomous driving technology.
As a Technical Lead Manager, you will define the strategy for automated resource provisioning, high-performance workload scheduling, and efficient feature management. You will balance hands-on technical leadership with people management, guiding a talented team while collaborating closely with ML Research and Autonomy teams to remove infrastructure bottlenecks and accelerate development of the Nuro Driver™.
Key Responsibilities
As the Technical Lead Manager for ML Platform Infrastructure, you will build the backbone that supports Nuro’s model development lifecycle, from experimentation to production, including:
- Technical Strategy Development: Crafting a roadmap for a cohesive ML platform that simplifies complex cloud infrastructure.
- Resource Provisioning & IaC: Expanding our automated infrastructure-as-code (IaC) pipelines to manage thousands of GPU/CPU nodes across multiple environments.
- Intelligent Scheduling: Engineering and fine-tuning workload orchestration to maximize hardware utilization, minimize job wait times, and support large-scale distributed training.
- Data Processing & ETL: Building robust pipelines that extract and transform petabyte-scale sensor and telemetry data into machine-learning-ready formats.
- Feature Management: Implementing effective feature caching and storage solutions to reduce redundant computation and ensure fast access to precomputed features.
- Team Leadership: Fostering a high-performance culture and mentoring team members.

