Multimodal AI Engineer specializing in Document Understanding

LlamaIndex, San Francisco
On-site, Full-time

Experience Level

Mid to Senior

Qualifications

Responsibilities:

Develop, train, and optimize machine learning models focused on document structure understanding, table extraction, layout analysis, and multimodal content processing.
Create efficient data pipelines, evaluation frameworks, and experimentation infrastructures.
Design and implement production-level ML systems capable of processing complex, real-world documents at scale.
Stay updated on the latest advancements in vision-language models, document AI, and multimodal learning.
Collaborate with engineering teams to integrate ML innovations into production APIs.
Contribute to our open-source frameworks and enterprise solutions.
Make informed technical decisions while balancing research exploration with product delivery.

Required Qualifications:

3-7 years of experience in machine learning engineering or applied research.
Strong software engineering fundamentals with experience in production Python (familiarity with tools like uv, ruff, mypy, Pydantic).
Hands-on experience in training, fine-tuning, or deploying ML models in production environments.
In-depth understanding of modern ML techniques, particularly in computer vision, NLP, or multimodal learning.
Experience with at least one ML framework such as TensorFlow or PyTorch.

About the job

Join our team and help define the future of AI, with a focus on document understanding.

About the Role:

We are on the lookout for talented AI engineers to become a part of our dedicated document understanding team. In this role, you'll be at the crossroads of computer vision, natural language processing, and production machine learning systems, driving advancements in document parsing and comprehension.

Our team powers LlamaParse, LlamaExtract, and other advanced processing solutions, handling millions of intricate documents such as PDFs, PowerPoints, Word files, and spreadsheets. Your work will directly affect the many developers building RAG applications and document agents, and it will strengthen our open-source frameworks, which aim to raise industry standards in document processing.

Depending on your expertise and interests, you may concentrate on data curation, model fine-tuning, or ML infrastructure. We are hiring multiple candidates and will collaborate with you to identify the perfect role for your skills.

About LlamaIndex

At LlamaIndex, we are committed to pushing the boundaries of artificial intelligence, particularly in the realm of document understanding. Join us in our mission to innovate and redefine how AI interacts with complex documents.
