About NomadicML
At NomadicML, we are harnessing the power of artificial intelligence to revolutionize the way machines understand and interpret motion. Our vision-language models (VLMs) transform vast amounts of video data into actionable insights, paving the way for advancements in self-driving technology, robotics, and industrial automation.
Founded by Mustafa Bal and Varun Krishnan, both alumni of Harvard University, our team comprises experts who have previously developed critical AI systems at industry giants like Snowflake, Lyft, Microsoft, Amazon, and IBM Research. With a commitment to innovation, we are dedicated to mining insights from the 5 trillion miles driven by Americans annually, uncovering the next frontier in machine intelligence.
About the Role
We are looking for a passionate Machine Learning Engineer who excels at the intersection of foundation-model research and production engineering. In this role, you will play a key part in shaping how machines learn from motion, focusing on training and refining large-scale Vision-Language Models that analyze complex real-world video data.
You will be responsible for creating multi-modal architectures that accurately perceive, localize, and describe motion events across millions of video frames, and for transforming these innovations into robust APIs and SDKs for enterprise clients.
Working closely with the founders, your contributions will include:
- Training and assessing VLMs tailored for motion comprehension within autonomous driving and robotics datasets.
- Designing and scaling GPU-accelerated pipelines for training, fine-tuning, and inference on diverse data types (video, language, and sensor metadata).
- Developing evaluation frameworks that benchmark spatiotemporal reasoning and localization precision.