Software Engineer - Machine Learning Data

Twelve Labs, Seoul, South Korea
On-site, Full-time

Qualifications

We are seeking candidates with:

  • Strong programming skills in languages such as Python, Java, or Scala

  • Experience with data processing frameworks like Apache Spark or Hadoop

  • Familiarity with machine learning principles and practices

  • Knowledge of database management and data storage solutions

  • Excellent problem-solving abilities and a collaborative mindset

About the job

Who We Are

Join us in setting global standards for video understanding AI! Twelve Labs is dedicated to developing cutting-edge AI models specifically for video content, enabling efficient processing of vast amounts of video data. Our technology offers advanced capabilities for search, analysis, summarization, and generating insights from video.

The largest sports leagues worldwide use our models to select highlights from extensive game footage quickly and accurately, providing a hyper-personalized viewing experience. In South Korea, integrated control centers partner with us to efficiently analyze CCTV footage for rapid crisis response. Major broadcasters and studios across the globe leverage our models to create content for billions of viewers.

Headquartered in San Francisco with an office in Seoul, Twelve Labs is a Deep Tech startup recognized for four consecutive years as one of the Top 100 AI Startups by CB Insights. We have secured over $110 million in funding from leading venture capital firms and corporations, including NVIDIA, NEA, Index Ventures, Databricks, and Snowflake. Our AI models are uniquely available through Amazon Bedrock, and we thrive on innovation and collaboration with exceptional colleagues worldwide.

At Twelve Labs, we operate on core values that include:

  • Honesty and reflection about ourselves and our teams

  • Resilience and humility, embracing failure and feedback

  • A commitment to continuous learning and enhancing team capabilities

If you enjoy tackling challenging problems and growing through the journey, the opportunity awaits you here at Twelve Labs.

About the Team

Our ML Data team operates on the belief that data determines AI model performance. We build high-quality data for training and evaluating multimodal AI models end-to-end. This includes gathering, filtering, processing, and labeling various types of multimodal data such as video, images, and audio. We collaborate with diverse teams to design datasets that unlock new model capabilities and develop evaluation datasets that reflect real user experiences. We also develop and continually enhance internal tools to perform these processes efficiently.

The ML Data team plays a pivotal role in the development of Twelve Labs' world-class video understanding models through a meticulously designed data pipeline.

About the Role

As a Software Engineer specializing in Data, you will design and develop pipelines for multimodal (video, image, audio) data that fundamentally enhance model performance through data quality. If you have experience designing and operating distributed systems for handling unstructured multimodal datasets, you can make a significant impact in this position. Rigorously refined, accurately labeled data forms the foundation of all model development at Twelve Labs, and you will have the opportunity to influence model quality more than in any other engineering role.

We are looking for someone to help us build data infrastructure that elevates our video understanding technology to the next level.

In this Role, You Will

  • Build data engines capable of collecting, preprocessing, refining, filtering, and labeling large multimodal (video, image, audio) datasets for LLM/VLM training.

  • Design and develop data systems that efficiently manage and visualize petabyte-scale video, image, and audio data.

  • Create libraries and services that deliver tangible impact, not just eye-catching features.

  • Collaborate closely with various teams to define project priorities and goals, leading technical initiatives from planning through development and operations.

About Twelve Labs

Twelve Labs is a pioneering Deep Tech startup focused on revolutionizing video understanding through advanced AI models. With offices in San Francisco and Seoul, we leverage cutting-edge technology to provide unparalleled insights and capabilities for video content processing. Backed by significant funding and recognized as a top AI startup, we are committed to innovation and excellence in our field.
