About the job
About Cartesia
At Cartesia, our vision is to create the future of artificial intelligence: intelligent systems that are seamlessly integrated into daily life. We aim to overcome current limitations by enabling models to continuously understand and analyze vast streams of audio, video, and text data (ranging from 1 billion text tokens to 1 trillion video tokens) right on your device.
Our pioneering team, composed of PhDs from the Stanford AI Lab, has developed State Space Models (SSMs), a groundbreaking approach to training efficient, large-scale foundation models. With a rich blend of expertise in model innovation and systems engineering, alongside a product-focused engineering team, we are committed to developing and delivering cutting-edge AI models and user experiences.
Supported by prominent investors including Index Ventures and Lightspeed Venture Partners, as well as many esteemed advisors and over 90 angel investors from diverse industries, we are at the forefront of AI advancements.
About The Role
In our quest to create truly global AI, we must train our models using datasets that represent the vast diversity of languages and cultures around the world. We are looking for a Research Engineer to take charge of the quality and comprehensiveness of the data that drives our models. As our in-house expert in global data, you will ensure that our models excel across multiple languages, leveraging your keen understanding of linguistic subtleties and your enthusiasm for building inclusive, large-scale datasets.
Your Impact
Design and construct extensive datasets for model training, conducting controlled experiments to evaluate their effect on model performance.
Develop assessments for speech models through both manual annotation and automated evaluation metrics.
Utilize data generation techniques to enhance model intelligence and reduce biases.
Create automated quality control systems to validate and filter the generated data.
Collaborate with product teams to ensure optimal support for key languages and markets.
What You Bring
Proven experience in developing or working with extensive multilingual datasets.
Familiarity with generative models, including speech, text, or multimodal systems.
Ability to guide human annotation and evaluation across various languages.
Strong analytical skills and a passion for data-driven decision-making.

