Tech Lead/Manager, Machine Learning Research Scientist - LLM Evaluations

Scale AI, Inc.San Francisco, CA; Seattle, WA; New York, NY

On-site Full-time $280K/yr - $380K/yr

Clicking Apply Now takes you to AutoApply where you can tailor your resume and apply.

Unlock Your Potential

Generate Job-Optimized Resume

One Click And Our AI Optimizes Your Resume to Match The Job Description.

Is Your Resume Optimized For This Role?

Find Out If You're Highlighting The Right Skills And Fix What's Missing

Experience Level

Manager

Qualifications

Key Responsibilities:Lead a high-performing team of research scientists and engineers focused on LLM evaluations. Conduct research on the effectiveness and constraints of current LLM evaluation techniques. Design and develop innovative evaluation benchmarks for large language models, addressing areas such as instruction adherence, factual accuracy, robustness, and fairness. Foster communication and collaboration with clients and peer teams to facilitate cross-functional initiatives. Work with internal teams and external partners to refine metrics and establish standardized evaluation protocols. Implement scalable and reproducible evaluation pipelines using modern machine learning frameworks. Publish research findings in top-tier AI conferences and contribute to open-source benchmarking initiatives. Stay current with ongoing research within the team, assist in overcoming technical challenges, and engage in design decision-making. Maintain strong involvement in the research community, both understanding trends and influencing them. Excel in a dynamic, fast-paced startup environment and commit to driving impactful results. Desired Qualifications:5+ years of practical experience in large language models, natural language processing, and Transformer modeling, in both research and engineering contexts. A proven track record of achieving significant research impacts in a fast-paced setting. Experience in supporting and leading a team of research scientists and engineers.

About the job

Our Research teams collaborate with top AI laboratories to provide high-quality data and expedite advancements in Generative AI research. As the Tech Lead/Manager of the LLM Evaluations Research team, you will guide a skilled team of research scientists and engineers dedicated to crafting and applying innovative evaluation methodologies, metrics, and benchmarks that assess the strengths and weaknesses of our advanced LLMs. This pivotal role involves designing and executing a strategic roadmap that establishes best practices in data-driven AI development, thus accelerating the development of the next generation of generative AI models in collaboration with leading foundational model labs.

About Scale AI, Inc.

Scale AI is the leading evaluation partner for advanced AI companies, focused on enhancing the benchmarking and assessment of large language models through innovative methodologies and collaboration with top research labs.

Similar jobs

1 - 20 of 4,275 Jobs

Search for Tech Lead For Deep Research Agent Development

4,275 results

Select all on this page (20)

Apply

Tech Lead for Deep Research Agent Development

Scale AI

Full-time|$252K/yr - $315K/yr|On-site|San Francisco, CA; New York, NY

Scale AI is on the lookout for an exceptional Staff/Senior Staff Machine Learning Engineer to serve as the Tech Lead (TL) for our innovative deep research agents tailored for enterprise applications. In this pivotal role, you will spearhead the technical vision and oversight for Deep Research Agent Development, converting state-of-the-art research in Generative AI, Large Language Models (LLMs), and Agentic Frameworks into powerful, scalable production systems that significantly improve enterprise operations and analytics.The successful candidate will excel in a dynamic environment, possess a fervor for intricate technical challenges, and demonstrate an aptitude for mentorship, while adeptly establishing a long-term technical strategy in a crucial domain without losing focus on hands-on execution.

Mar 26, 2026

Apply

Developer Relations Lead

Pluralis Research

Full-time|On-site|San Francisco

Pluralis Research is at the forefront of Protocol Learning—an innovative decentralized approach to training and deploying AI models that democratizes access to this technology for individuals, rather than just large corporations. By aggregating computing resources from numerous contributors, incentivizing participation, and ensuring no single entity can dominate the model's complete weights, we are forging a truly open and collaborative pathway to cutting-edge AI.Role OverviewWe are seeking a passionate Developer Relations Lead to serve as the crucial technical liaison between Pluralis's research initiatives and the broader machine learning and systems communities. In this role, you will transform complex, groundbreaking research (including distributed training, communication-efficient model parallelism, and fault-tolerant optimization) into clear, engaging, and accessible content for researchers, engineers, and innovators.This position is not merely a traditional marketing role. We are looking for an individual who can digest our research papers, grasp the underlying architecture, and convey these insights effectively through blog posts, conference presentations, or social media updates. You will shape our technical narrative and become the face of Pluralis's contributions within the community.

Mar 25, 2026

Apply

Tech Lead Manager - Machine Learning Research and Engineering

Scale AI

Full-time|$252K/yr - $315K/yr|On-site|San Francisco, CA; New York, NY

Join Scale's innovative Large Language Model (LLM) post-training platform team, where you will contribute to the development of our internal distributed framework designed specifically for LLM training. This sophisticated platform empowers Machine Learning Engineers (MLEs), researchers, data scientists, and operators to perform rapid and automated training and evaluation of LLMs. Additionally, it underpins the training framework for our data quality evaluation pipeline.Scale is at the forefront of the Artificial Intelligence sector, acting as a vital provider of training and evaluation data, as well as comprehensive solutions for the entire machine learning lifecycle. In this role, you will collaborate closely with Scale’s ML teams and researchers to construct the foundational platform that supports all our ML research and development initiatives. Your work will involve building and optimizing this platform to facilitate the training, inference, and data curation of next-generation LLMs.If you are passionate about driving the future of AI through groundbreaking innovations, we invite you to connect with us!

Mar 26, 2026

Apply

Lead Developer for AI Agent Technology

Postman

Full-time|$256K/yr - $276K/yr|On-site|San Francisco, California, United States

Who Are We?Postman is the premier API platform, empowering over 45 million developers and 500,000 organizations globally, including 98% of the Fortune 500. Our mission is to enable the creation of an API-first world by simplifying every phase of the API lifecycle and enhancing collaboration, allowing users to build superior APIs more efficiently.Headquartered in San Francisco, Postman also has offices in Boston, New York, Austin, Tokyo, London, and Bangalore, where we originated. As a privately held company, we are backed by prominent investors such as Battery Ventures, BOND, Coatue, CRV, Insight Partners, and Nexus Venture Partners. Discover more at postman.com or engage with us on X via @getpostman.P.S: We encourage you to read The "API-First World" graphic novel to gain insight into our vision at Postman.The OpportunityAs the AI Agent Development Lead, you will spearhead the design, development, and deployment of cutting-edge AI agents that engage with users and complex environments. You will play a crucial role in shaping the architecture and execution of scalable, dependable AI systems, collaborating closely with research, product, and engineering teams to create safe, interpretable, and efficient AI technologies.What You’ll DoLead a diverse engineering team focused on AI agent development from initial design through to production deployment.Design and implement AI agent architectures utilizing cutting-edge language models and related technologies.Work alongside research scientists to conduct scalable experiments and translate research innovations into product features.Drive the evolution of agent capabilities, including dialogue management, decision-making, and autonomy.Ensure that AI safety and alignment principles are embedded throughout the agent lifecycle.Mentor and develop technical team members, fostering a culture of collaboration and innovation.Assess new tools, frameworks, and methodologies to enhance AI agent development.

Mar 19, 2026

Apply

Research Lead

abundant

Full-time|Remote|San Francisco

abundant seeks a Research Lead based in San Francisco. This position steers research activities that help shape the company’s direction. The Research Lead partners with colleagues to analyze data, draw meaningful insights, and support projects where research has a clear business impact. Key responsibilities Plan, manage, and execute research initiatives from start to finish Work with team members to analyze data and spot important trends Turn research results into practical recommendations for the business Support projects that guide company strategy Collaboration and impact This role involves close teamwork and communication across departments. Research findings directly inform business decisions and contribute to the company’s ongoing growth.

Apr 24, 2026

Apply

Tech Lead Manager - Agentic Runtime

gleanwork

Full-time|On-site|San Francisco Bay Area

As a Tech Lead Manager for Agentic Runtime at gleanwork, you will be at the forefront of technological innovation, driving the development of cutting-edge solutions. You will lead a talented team of engineers, fostering a culture of collaboration and excellence while ensuring that our products meet the highest standards of quality and performance.Your role will involve strategic planning, overseeing project execution, and mentoring team members to enhance their skills and career growth. You will work closely with cross-functional teams to identify opportunities for improvement and implement new technologies that align with our business goals.

Mar 25, 2026

Apply

Researcher, Agentic Post-Training

OpenAI

Full-time|On-site|San Francisco

Role overview OpenAI is looking for a Researcher focused on Agentic Post-Training, based in San Francisco. This role centers on analyzing and improving how AI systems behave after their initial training. The goal is to broaden the capabilities of AI and refine how models respond in complex situations. What you will do Study and assess agentic behaviors in trained AI models Create new approaches to strengthen these behaviors after training Collaborate with a talented team on projects that shape the future of artificial intelligence research Collaboration and impact This position involves hands-on research with other specialists at OpenAI. The work directly supports the advancement of AI capabilities and helps define new benchmarks for agentic performance in artificial intelligence.

Apr 23, 2026

Apply

Tech Lead at Oliv.AI | Remote

Oliv.AI

Full-time|Remote|Remote - India

About UsOliv.AI is an innovative global startup in the SalesTech sector, based in San Francisco. We are proud to introduce the world’s first team of AI Agents designed specifically for sales. Following our successful $5.2M Seed funding, we aim to tackle one of the most pressing issues faced by revenue teams: unreliable deal data. Oliv captures Deal Intelligence from every meeting, call, and email—without any input from sales representatives. The outcome is a clear, detailed overview of every deal, displayed in scorecards based on trusted sales methodologies such as MEDDICC, BANT, and SPICED. Our AI agents are tailored for sales teams, including sales managers, account executives, and revenue operations, managing tasks that distract them from their primary focus—selling. With Oliv AI, sales teams can refocus on deals, strategy, and meaningful conversations.About The Role:As a member of our core engineering team, you will play a crucial role in designing, building, and overseeing products that our users cherish and rely on daily. You will utilize Clojure(script), a functional programming language that transforms software development into a truly joyful experience.We seek engineers who possess a robust customer-centric mindset and a profound sense of product ownership. Attention to detail and a passion for continuous learning about new technologies, systems, and methodologies are essential. You will not only contribute to creating outstanding products but also to building a great company grounded in first principles. You will thrive in our environment if you enjoy putting on your engineering hat, challenging norms, and solving complex problems.Our small but passionate team of engineers is committed to developing exceptional software for our users. We move swiftly and implement strong feedback loops, both internally and externally, to foster continuous improvement. We chose Clojure(script) as our programming language to enhance our productivity and deliver high-quality, bug-free products. We embrace system thinking with a focus on principled reasoning. We firmly believe that trust and autonomy are vital for developing the best products and delivering an exceptional customer experience. In summary, we aim to empower our users to be more productive and contribute to making the world a better place.

Feb 25, 2026

Apply

Research Scientist in Agent Robustness

Scale AI

Full-time|$197.4K/yr - $246.8K/yr|On-site|San Francisco, CA; New York, NY

Join Scale Labs as a Research Scientist — Agent RobustnessScale is the premier partner for data and evaluation within the forefront of AI innovation, playing a crucial role in understanding and safeguarding AI models and systems. Building on our extensive expertise, Scale Labs has initiated a dedicated team focused on policy research, aiming to connect AI research with global policymakers to facilitate informed, scientifically grounded decisions regarding AI risks and capabilities.Our research addresses complex challenges in agent robustness, AI control protocols, and AI risk evaluations, empowering governments, industries, and the public to comprehend and mitigate AI risks while promoting AI adoption. This team collaborates across various sectors, including industry, public services, and academia, and regularly disseminates our findings. We are actively inviting skilled researchers to contribute to this vision.As a Research Scientist specializing in Agent Robustness, you will tackle foundational challenges in creating AI agents that are both safe and aligned with human values. Your responsibilities may include:Investigating the science behind AI agent capabilities, focusing on safety, risk factors, and benchmarking methodologies.Designing and building testing harnesses to evaluate AI agents' tendencies to engage in harmful actions under user pressure or environmental manipulation.Creating exploits and mitigations for new failure modes that emerge as AI agents gain capabilities such as coding, web browsing, and computer usage.Characterizing and developing mitigations for potential failure modes or broader risks involving multiple interacting AI agents.

Mar 26, 2026

Apply

Tech Lead Manager - Agents

Perplexity

Full-time|On-site|San Francisco

Join Perplexity as a Tech Lead Manager (TLM) to spearhead the dynamic Agents engineering team. Our Agents team, composed of AI/ML, backend, and full-stack engineers, collaborates to craft engaging agentic experiences within our Comet ecosystem. Our mission is to empower users with AI agents that effectively translate their intent through seamless interactions with the digital world.As the TLM for the Agents team, you will leverage your AI expertise, intuitive product vision, and robust engineering management skills to push the boundaries of what our agents can achieve for our millions of loyal users. Your leadership will be pivotal in tackling numerous challenges in AI, including:Crafting AI agents capable of navigating the digital landscape and delivering valuable outcomes for users;Training sophisticated action and decision models that interpret complex multimodal inputs to fulfill user-defined goals;Ensuring exceptional user experiences across various platforms including desktop, mobile, and cloud environments through adaptable abstractions;Developing secure-by-design architectures and classifiers to empower agentic functionality;Designing effective data representations and interaction modes between agents and their environments;and much more.Key ResponsibilitiesProvide innovative technical leadership across various facets of our expanding AI agents product.Utilize advanced AI models, infrastructure, and browser technologies to elevate our capabilities for a growing user base.Apply keen technical and product insight to shape system architectures and product strategies.Enhance product reliability, code quality, AI assessment, testing, and upkeep across the team.Lead recruitment, onboarding, and mentorship for a rapidly expanding team, developing rigorous interview processes and collaborating with recruiting teams.Engage with Perplexity co-founders to achieve strategic objectives that redefine possibilities in the AI sector.Desired QualificationsProven experience in AI/ML technologies, software engineering, and leading diverse teams.Strong capabilities in product management and strategic planning within a tech-driven environment.Exceptional communication skills, with the ability to articulate complex technical concepts to non-technical stakeholders.A passion for innovation and the ability to thrive in a fast-paced, evolving landscape.

Dec 20, 2025

Apply

Tech Lead/Manager, Machine Learning Research Scientist - LLM Evaluations

Scale AI, Inc.

Full-time|$280K/yr - $380K/yr|On-site|San Francisco, CA; Seattle, WA; New York, NY

As a premier data and evaluation partner for cutting-edge AI firms, Scale AI is committed to enhancing the evaluation and benchmarking of large language models (LLMs). We are developing industry-leading LLM evaluations that set new benchmarks for model performance assessment. Our mission is to create rigorous, scalable, and equitable evaluation methodologies that propel the next evolution of AI capabilities.Our Research teams collaborate with top AI laboratories to provide high-quality data and expedite advancements in Generative AI research. As the Tech Lead/Manager of the LLM Evaluations Research team, you will guide a skilled team of research scientists and engineers dedicated to crafting and applying innovative evaluation methodologies, metrics, and benchmarks that assess the strengths and weaknesses of our advanced LLMs. This pivotal role involves designing and executing a strategic roadmap that establishes best practices in data-driven AI development, thus accelerating the development of the next generation of generative AI models in collaboration with leading foundational model labs.

Mar 26, 2026

Apply

AI Agent Development Engineer | Flowith | San Francisco

Flowith

Full-time|On-site|San Francisco, California, United States

Role OverviewJoin Flowith as an innovative AI Agent Development Engineer, where you will serve as the "Architect of Intelligence." Your role involves crafting the cognitive frameworks behind our cutting-edge products. You will connect the intricacies of Large Language Models with real-world applications by developing advanced AI Agents. Your mission is to enhance Flowith Agents, evolving static algorithms into dynamic collaborators that can reason, plan, and execute tasks in partnership with our users.Key ResponsibilitiesCore Development of Agents: Create and implement the foundational reasoning logic for Flowith Agents, enabling them to manage complex planning, tool invocation, and long-term memory.Collaboration Frameworks: Design and optimize multi-agent collaboration systems that enable AI to efficiently navigate the Flowith Canvas, interacting seamlessly with various nodes and external APIs.Performance Evaluation & Optimization: Set rigorous benchmarks to evaluate Agent performance, systematically analyzing reasoning failures or "hallucinations" and iterating on prompts and logic to enhance reliability.Integration & Tooling: Develop and maintain tools ("skills") for Agents, expanding their capabilities to interact with web services, databases, and third-party SaaS platforms.Research & Development: Stay at the forefront of AI research—exploring new model capabilities and agentic patterns—and rapidly prototype features that ensure Flowith leads in the AI-native landscape.

Mar 10, 2026

Apply

Senior Lead, Research & Evaluation

aiedu

Full-time|On-site|San Francisco, United States

Join aiedu as a Senior Lead in Research & Evaluation, where you will drive impactful research initiatives that shape educational practices and policies. In this role, you will lead a team of researchers in designing and executing comprehensive evaluations that inform our strategic direction. Your expertise will be critical in analyzing data, generating insights, and communicating findings to stakeholders.

Mar 13, 2026

Apply

Tech Lead Manager, Developer Experience

Peregrine Technologies

Full-time|$220K/yr - $275K/yr|On-site|San Francisco, CA

Peregrine Technologies, headquartered in San Francisco, develops an AI platform that transforms fragmented data into actionable intelligence for public safety, government, and enterprise clients. With support from top Silicon Valley investors, Peregrine serves hundreds of organizations across more than 30 states and two countries, impacting over 125 million people. The company is now expanding further into enterprise and international markets. The Developer Experience (DevEx) team plays a central role as Peregrine’s engineering organization rapidly grows from 50 to over 120 members in a year. As the team expands, challenges like slower development cycles, onboarding complexity, and deployment reliability have become more pressing. The DevEx group, a small and influential team of 2 to 3 engineers, focuses on making daily work smoother for internal developers by identifying pain points and delivering tools and processes that improve efficiency across engineering. Role overview The Tech Lead Manager, Developer Experience, leads the team responsible for engineering tooling, developer velocity, observability, and monitoring. This role is hands-on: setting technical direction, mentoring a small group of senior engineers, and collaborating with other teams to address developer productivity challenges. Defining and tracking metrics for engineering effectiveness is a central part of the job, helping guide where to invest resources for the greatest impact. The work of this team directly influences the productivity of an engineering group of more than 100 people. Key responsibilities Own the DevEx roadmap: Define strategy for CI/CD, observability, and developer productivity. Prioritize work based on data and developer feedback. Lead and develop the team: Manage and mentor 2-3 engineers, support their career growth, provide direction, recruit new talent, and raise engineering standards. Establish effectiveness metrics: Instrument workflows and develop metrics such as build times, deployment frequency, developer satisfaction, and onboarding speed. Improve CI/CD and deployment reliability: Maintain efficient, dependable build, test, and deployment pipelines as the company scales. Develop observability and monitoring frameworks: Build systems for real-time insights and operational monitoring.

Apr 23, 2026

Apply

Researcher, Interpretability

OpenAI

Full-time|On-site|San Francisco

About Our TeamJoin the Interpretability team at OpenAI, where we delve into the inner workings of deep learning models. Our mission is to leverage internal representations to gain insights into model behavior and to design models that offer clearer interpretations. We prioritize applying our findings to enhance the safety of advanced AI systems. Our collaborative and inquisitive work culture fosters innovation and exploration.About the PositionOpenAI is on the lookout for a dedicated researcher with a passion for deep learning and a solid engineering background. In this role, you will develop and execute a research agenda focused on mechanistic interpretability, working closely with a team of driven individuals. Your contributions will be vital in ensuring that future AI models remain safe as their capabilities expand, significantly advancing our commitment to creating safe AGI.Key Responsibilities:Conduct and publish research on methods for interpreting the representations of deep networks.Develop infrastructure to analyze model internals on a large scale.Collaborate across various teams to undertake projects uniquely suited to OpenAI’s capabilities.Direct research initiatives towards tangible usefulness and long-term scalability.Ideal Candidate Profile:Passionate about OpenAI’s mission to ensure that AGI benefits all of humanity, and aligned with OpenAI’s charter.Enthusiastic about long-term AI safety and knowledgeable about the technical pathways to achieve safe AGI.Experience in AI safety, mechanistic interpretability, or closely related fields.Possess a Ph.D. or substantial research background in computer science, machine learning, or a related discipline.Excited to engage with large-scale AI systems and utilize OpenAI’s exceptional resources in this domain.Have 2+ years of experience in research engineering and proficiency in Python or similar programming languages.Exhibit a deep curiosity and willingness to explore new ideas.

Jun 15, 2025

Apply

Staff Software Engineer - Agent Development

Decagon

Full-time|On-site|San Francisco

Join Decagon as a Staff Software Engineer specializing in Agent Development. In this pivotal role, you will work on cutting-edge software solutions that enhance our capabilities in agent-based systems. You will collaborate with cross-functional teams to design, develop, and implement innovative software applications that drive our mission forward.

Feb 25, 2026

Apply

Research Engineer

OpenAI

Full-time|On-site|San Francisco

Join us at OpenAI as a Research Engineer, where your innovative ideas will shape the future of artificial intelligence.About the RoleIn this pivotal position, you will be instrumental in developing cutting-edge AI systems that tackle challenges previously deemed insurmountable. We are seeking individuals with exceptional engineering capabilities, particularly in designing and enhancing large-scale distributed machine learning systems, writing efficient machine learning code, and advancing the scientific foundations of our algorithms.The most remarkable outcomes in deep learning are increasingly achieved at scale, necessitating engineers who thrive in expansive distributed systems. Your engineering expertise will be vital to driving significant advancements in AI technology.Key Responsibilities:Demonstrate strong programming and coding proficiencyPossess experience in managing and optimizing large distributed systemsExpress enthusiasm for OpenAI's innovative research methodologiesPreferred Qualifications:Exhibit a thoughtful perspective on the societal impacts of AI technologyBring prior experience in developing high-performance implementations of deep learning algorithmsAbout OpenAIOpenAI is at the forefront of AI research and application, committed to ensuring that general-purpose artificial intelligence serves the greater good of humanity. We strive to extend the limits of AI capabilities while prioritizing safety and human-centric design in our products. Our mission is to embrace diverse perspectives and experiences that enrich our understanding of humanity in the pursuit of our goals.We are proud to be an equal opportunity employer, welcoming applicants from all backgrounds without discrimination. For more information, please refer to OpenAI’s Affirmative Action and Equal Employment Opportunity Policy Statement.Background checks will be conducted in accordance with applicable laws.

Apr 5, 2025

Apply

Staff Machine Learning Research Scientist/Engineer, Agents

Scale AI

Full-time|$275K/yr - $350K/yr|On-site|San Francisco, CA; Seattle, WA; New York, NY

About Scale AI At Scale AI, we are dedicated to propelling the advancement of AI applications. Over the past eight years, we have established ourselves as the premier AI data foundry, supporting groundbreaking innovations in fields such as generative AI, defense technologies, and autonomous vehicles. Following our recent Series F funding round, we are intensifying our efforts to harness frontier data, paving the way toward achieving Artificial General Intelligence (AGI). Our work with enterprise clients and governments has enhanced our model evaluation capabilities, allowing us to expand our offerings for both public and private evaluations. About the ACE Team The Agent Capabilities & Environments (ACE) team, a vital part of Scale’s Research organization, unites customer-focused Researchers and Applied AI Engineers. Our primary mission is to conduct research on agent environments and reinforcement learning reward signals, benchmark autonomous agent performance in real-world contexts, and develop robust data programs aimed at enhancing the capabilities of Large Language Models (LLMs). We are committed to creating foundational tools and frameworks for evaluating models as agents, focusing on autonomous agents that interact dynamically with a wide range of external environments, including code repositories and GUI interfaces. About This Role This position sits at the cutting edge of AI research and its practical applications, concentrating on the data types necessary for the development of state-of-the-art agents, including browser and software engineering agents. The ideal candidate will investigate the data landscape required to propel intelligent and adaptable AI agents, steering the data strategy at Scale to foster innovation. This role demands not only expertise in LLM agents and planning algorithms but also creative problem-solving skills to tackle novel challenges pertaining to data, interaction, and evaluation. You will contribute to influential research publications on agents, collaborate with customer researchers, and partner with the engineering team to transform these advancements into scalable real-world solutions.

Mar 26, 2026

Apply

Lead Research Engineer

Lightning AI

Full-time|$225K/yr - $275K/yr|Hybrid|London, England, United Kingdom; New York, New York, United States; San Francisco, California, United States

Who We AreLightning AI, the innovative force behind PyTorch Lightning, was established in 2019 to create a seamless end-to-end platform for developing, training, and deploying artificial intelligence systems. Our mission is to facilitate the transition from research to production effortlessly.In partnership with Voltage Park, a leading neocloud and AI Factory, Lightning AI merges developer-centric software with optimized, large-scale computing solutions. We empower teams with the necessary tools for experimentation, training, and production inference while ensuring built-in security, observability, and control.We cater to individual researchers, emerging startups, and large enterprises alike. With a global presence, our offices are located in New York City, San Francisco, Seattle, and London, backed by top-tier investors including Coatue, Index Ventures, Bain Capital Ventures, and Firstminute.

Mar 13, 2026

Apply

Machine Learning Researcher - Multimodal LLMs

Bland Inc.

Full-time|On-site|San Francisco

Bland Inc. seeks a Machine Learning Researcher specializing in Multimodal Large Language Models (LLMs) to join the team in San Francisco. The focus is on advancing AI systems that integrate language with other types of data. Role overview This position centers on research and development aimed at improving how AI models process and understand information from multiple sources, such as text combined with images or other modalities. What you will do Investigate how language interacts with additional data types within multimodal LLMs Create and evaluate new methods to enhance AI model performance Work closely with colleagues on projects designed to push the boundaries of machine learning Location This role is based in San Francisco.

Apr 21, 2026

Create account — see all 4,275 results