Machine Learning Research Engineer - Data at Liquid AI | San Francisco

Liquid AI, San Francisco
On-site, Full-time

Experience Level

Entry Level

Qualifications

  • Bachelor's degree in Computer Science, Data Science, Machine Learning, or a related field.
  • Experience with machine learning frameworks such as TensorFlow, PyTorch, or similar.
  • Strong programming skills in Python or a similar language.
  • Familiarity with machine learning algorithms and techniques.
  • Ability to work collaboratively in a team environment.

About the job

About Liquid AI

Founded as a spin-off from MIT CSAIL, Liquid AI specializes in creating versatile AI systems designed for optimal performance across various deployment platforms, including data center accelerators and on-device hardware. Our technology emphasizes low latency, minimal memory consumption, privacy, and dependability. We collaborate with leading enterprises in sectors such as consumer electronics, automotive, life sciences, and financial services. As we experience rapid growth, we are on the lookout for exceptional talent to join our team.

The Opportunity

The Data team at Liquid AI drives the development of our Liquid Foundation Models, focusing on pre-training, vision, audio, and emerging modalities. With the stagnation of public data sources, the effectiveness of our models increasingly relies on specially curated datasets. We are seeking engineers with a machine learning mindset who can efficiently gather, filter, and synthesize high-quality data at scale.

At Liquid AI, we regard data as a research challenge rather than an infrastructural issue. Our engineers conduct experiments, design ablations, and assess how data-related decisions impact model quality. We will align you with a team where you can experience rapid growth and make a significant impact, be it in pre-training, post-training reinforcement learning, vision-language, audio, or multimodal applications.

While we prefer candidates in San Francisco and Boston, we are open to considering other locations.

What We're Looking For

We are in search of a candidate who:

  • Thinks like a researcher and executes like an engineer: You should be able to formulate hypotheses, conduct experiments, and evaluate results. Our engineers produce research-level code while our researchers implement production systems.
  • Learns quickly and adapts: You will be working in rapidly evolving modalities, so the ability to quickly grasp new domains and thrive in ambiguity is essential.
  • Prioritizes data quality: We hold data quality in high regard; tasks such as filtering, deduplication, augmentation, and evaluation are key responsibilities, not afterthoughts.
  • Solves problems autonomously: Data engineers operate within training groups (pre-training and multimodal). While collaboration is crucial, we expect ownership and self-direction.

The Work

  • Develop and maintain data processing, filtering, and selection pipelines at scale.
  • Establish pipelines for pre-training, mid-training, supervised fine-tuning, and preference optimization datasets.
  • Design synthetic data generation systems utilizing large language models (LLMs), structured prompting, and domain-specific generative techniques.

About Liquid AI

Liquid AI is an AI company that emerged from MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL) and develops efficient AI systems for diverse deployment environments. Its products serve industries such as consumer technology, automotive, healthcare, and finance.
