About the job
Join Avride, an innovative leader in autonomous mobility, as we develop cutting-edge self-driving vehicles and delivery robots. Our mission is to revolutionize transportation and logistics, with our Labeling Team playing a critical role in realizing this vision.
The Labeling Backend Team builds the data infrastructure behind our research and development work: labeling pipelines, data preparation workflows, and model training processes. The high-quality labeled data we produce is essential for advancing our core technologies and supports the variety of models that underpin our business.
About the Role
We are seeking a talented Research Engineer dedicated to enhancing the quality and representativeness of the datasets that fuel our self-driving systems. In this role, you will design algorithms and tools for auto-labeling, data mining, and dataset monitoring, combining strong Python engineering skills with applied machine learning. Your contributions will improve data efficiency, reduce labeling costs, and elevate model performance.
Key Responsibilities
- Create and implement algorithms that optimize annotation processes, including auto-labeling systems that minimize manual effort and maximize throughput.
- Develop data-mining and active-learning pipelines to identify the most valuable samples for training.
- Establish dataset-quality monitoring systems that detect noise, redundancy, and low-value data.
- Construct analytical platforms (databases, dashboards, reporting) to monitor dataset quality and coverage over time.
- Collaborate closely with ML and Perception teams to integrate research findings into production workflows.
- Investigate emerging methodologies (vision-language models, weak supervision, uncertainty estimation) to enhance dataset quality and automation.

