PhD Research Internship – Robotics Engineer (VLM / VLA Models)

sensmore, Berlin / Potsdam
On-site Internship

Experience Level

Entry Level

Qualifications

Candidates should have a strong background in robotics, AI, or a related field, with a focus on Vision-Language Models and multi-modal learning. Enrollment in a PhD program is required, along with a passion for research and innovation. Strong programming skills and experience in data analysis are essential.

About the job

sensmore builds automation systems for heavy machinery, applying intelligent robotics to help equipment such as wheel loaders adapt to changing tasks and environments. Its Physical AI platform connects robotics with real-world industrial needs, aiming to boost productivity and safety in sectors such as mining and construction.

This PhD Research Internship centers on advancing industrial automation, blending research and engineering in a practical setting. The position is based in Berlin or Potsdam and focuses on Vision-Language Models (VLM) and Vision-Language-Action (VLA) systems for robotics.

Role overview

The internship targets general-purpose AI, with an emphasis on developing scalable VLA systems that enable robots to perceive, reason, and act in complex industrial environments. The work combines multi-modal perception, including video, radar, and lidar, with practical robotics. Interns will contribute to embodied AI research for heavy industry, working at the intersection of method development and hands-on engineering. There are opportunities to publish research and influence the direction of industrial autonomy at sensmore.

Key responsibilities

  • Research and method development:
    • Design and implement new approaches for Vision-Language-Action systems in industrial contexts.
    • Investigate scalable architectures for multi-modal reasoning and action generation.
    • Advance methods in embodied AI and robotic autonomy.
  • Multi-modal learning and data systems:
    • Lead the design and evaluation of large-scale multi-modal datasets spanning video, radar, and lidar, including fused sensor data.
    • Develop self-supervised or weakly supervised pipelines for generating VLA datasets.
    • Explore data-centric strategies to improve robustness and generalization.
  • Model development and optimization:
    • Build, adapt, and extend advanced models to achieve project objectives.

About sensmore

sensmore is at the forefront of transforming heavy machinery with intelligent automation. Our groundbreaking Physical AI technology enables machines to operate autonomously in complex environments, setting new standards in productivity and safety across various industries.
