Machine Learning Research Scientist I/II - Multimodal Data Extraction

Lila Sciences, Cambridge, MA, USA
On-site Full-time $176K/yr - $304K/yr

Experience Level

Mid to Senior

Qualifications

What You’ll Need to Succeed

  • PhD or equivalent research experience in Computer Science, Chemistry, Materials Science, or a related discipline.
  • In-depth knowledge of machine learning, NLP, and vision-language modeling using PyTorch and Hugging Face Transformers.
  • Demonstrated ability to train, fine-tune, and evaluate LLMs and multimodal models for scientific data extraction.
  • Strong understanding of the data structures and representations common in the physical sciences.
  • Proven research impact through publications (e.g., NeurIPS, ICLR, ICML, ACL, EMNLP, scientific journals), preprints, or contributions to open-source projects.

About the job

Your Impact at Lila

As an ML Research Scientist specializing in Multimodal Data Extraction, you will play a pivotal role in advancing Lila's mission of achieving scientific superintelligence. Your work will focus on the development of foundational models capable of autonomously reading, interpreting, and organizing scientific knowledge from diverse formats such as text, images, and experimental data in the physical sciences. Your research will contribute to the unification of global scientific data into a machine-readable format, enhancing reasoning, prediction, and autonomous discovery within materials science and chemistry.

What You Will Be Building

  • Innovate and create AI systems that effectively extract and organize knowledge from a variety of scientific resources.
  • Design and optimize large language models, multimodal models, and specialized architectures for accurate and interpretable data extraction.
  • Develop scalable solutions for managing unstructured and heterogeneous scientific data, integrating various formats including text, tables, and visuals.
  • Collaborate with subject matter experts to ensure that the extracted data aligns with real-world research workflows.
  • Publish impactful research that propels the field of multimodal understanding and AI-driven knowledge extraction forward.

About Lila Sciences

At Lila Sciences, we are committed to revolutionizing the way scientific knowledge is accessed and utilized. By harnessing advanced artificial intelligence technologies, we aim to empower researchers and scientists in their quest for discovery and innovation.
