About Us
Join us in setting a global standard for video understanding AI.
At Twelve Labs, we are creating world-class video-specialized AI models that efficiently process vast amounts of video data, offering advanced capabilities in video search, analysis, summarization, and insights generation.
Our models are used by the largest sports leagues worldwide to curate highlights from extensive game footage quickly and accurately, providing a hyper-personalized viewing experience. In South Korea, integrated control centers use our technology to navigate CCTV footage efficiently during crisis situations, and major broadcasters and studios around the world use our models to create content for billions of viewers.
As a deep tech startup with offices in San Francisco and Seoul, Twelve Labs has been recognized as one of the top 100 AI startups by CB Insights for four consecutive years. We have raised over $110 million from esteemed venture capital firms and companies including NVIDIA, NEA, Index Ventures, Databricks, and Snowflake. Our AI model, uniquely developed in Korea, is exclusively available through Amazon Bedrock. We are committed to building innovative products alongside exceptional colleagues and growing with customers worldwide.
At Twelve Labs, we operate on core values that include:
A commitment to honesty and reflection regarding ourselves and our team
Resilience and humility in the face of failure and feedback
An ongoing pursuit of learning to enhance team capabilities
If you enjoy solving challenging problems and growing through the process, an opportunity awaits you at Twelve Labs.
About the Team
Our ML Data Operation team oversees all aspects of video-language data collection, labeling, and quality assurance. This position entails defining dataset requirements in collaboration with the Research and Product teams, designing and building data pipelines, and coordinating with external vendor partners for large-scale execution.
Automating repetitive tasks in partner management and annotation quality assessment is also a key responsibility. Success in this role demands a strong commitment to collaboration and relationship-building with relevant departments.
About the Role
In this role, you will lead video-language data collection, labeling, and quality control, a pivotal part of the data pipeline that underpins machine learning model training.
Project Management and Operations: Design and execute video-language data collection and labeling projects while automating repetitive processes to maintain a balance between efficiency and quality.
External Partner Collaboration: Enhance project quality through seamless collaboration with vendors and outsourcing partners.
Data Quality and Tooling Enhancement: Establish labeling guidelines, monitor data quality, and improve tools and infrastructure for a sustainable data operation system.
Portfolio Monitoring: Manage resource allocation and schedules for all data streams within your product vertical based on real-time insights, adjusting direction as needed.
Internal Collaboration: Work with engineering and AI model teams to align on priority data needs, design tools such as analysis reports and dashboards, and communicate project progress clearly.
Ideal Candidate
You may be a suitable candidate if you possess:
Over 5 years of experience in AI-centric data operations
Proven experience in designing and executing large-scale data projects, including collection, labeling, and post-processing
Strong analytical skills and the ability to manage complex and dynamic projects

