Lead Data Operations Evaluation Engineering jobs in San Francisco – Browse 7,112 openings on RoboApply Jobs

Lead, Data Operations & Evaluation Engineering

Arcade AI | Presidio, CA

On-site Full-time

Experience Level

Manager

Qualifications

Key Responsibilities:

Develop and Execute AI Data Strategy: Design and lead the data strategy for arcade.ai, focusing on the collection and governance of extensive datasets to enhance generative AI model training.

AI Data Acquisition & Management: Collaborate with the AI team and CEO to define training data requirements that align with our innovative model development goals.

About the job

Join Arcade AI as a Lead in Data Operations & Evaluation Engineering!

Arcade AI is pioneering the future of e-commerce with our groundbreaking generative AI design platform, enabling consumers, makers, and enterprises to effortlessly create custom products. Our vision is to transform imaginative ideas into tangible products, seamlessly connecting creativity with a streamlined supply chain.

As the Lead, you will play a crucial role in shaping our data strategy, ensuring that our AI models are powered by diverse, high-quality datasets. You’ll oversee the sourcing, organization, and processing of data while implementing robust metrics and tools to evaluate our models' performance.

About Arcade AI

Arcade AI is headquartered in the scenic Presidio of San Francisco, led by visionary entrepreneur Mariam Naficy. Our team comprises talent from industry giants such as Google, Apple, and NVIDIA, all dedicated to creating a new era in personal expression and on-demand manufacturing.

Similar jobs

Lead, Data Operations & Evaluation Engineering

Arcade AI

Full-time|On-site|Presidio, CA

Jan 6, 2026

Lead Data Operations Specialist

Sieve

Full-time|On-site|San Francisco

About Us

Sieve is an innovative AI research laboratory dedicated exclusively to video data. We leverage exabyte-scale video infrastructure and cutting-edge video understanding techniques to create expansive datasets that advance the field of video modeling. With video comprising 80% of internet traffic, it has become a crucial digital medium that drives creativity, communication, gaming, AR/VR, and robotics. Sieve exists to address the critical challenge of providing high-quality training data, which is essential for the growth of these applications.

As a testament to our success, we have partnered with leading AI labs and achieved significant revenue growth, generating $XX million last quarter with a compact team of just 15 professionals. Our Series A funding last year was secured from top-tier firms including Matrix Partners, Swift Ventures, Y Combinator, and AI Grant.

About the Role

In your role as the Data Operations Lead, you will be responsible for the daily execution and expansion of Sieve's data operations platform. This position is a blend of operational and semi-technical responsibilities. You will oversee our human workforce, enhance quality assurance processes, manage talent sourcing and onboarding, and spearhead product operations initiatives aimed at optimizing our platform's efficiency. A key focus of this role will be on growth; you will implement campaigns and experiments to broaden the platform's user base, identify new sourcing channels, and drive user adoption.

This position is ideal for individuals who are both creators and optimizers, capable of engaging deeply with tools while also strategically considering how to scale complex operational systems.

What You'll Do

Oversee and scale Sieve's internal data operations platform, which includes workforce management, task assignments, and quality assurance workflows.
Drive growth initiatives for the platform by executing acquisition campaigns, testing new sourcing channels, and creatively expanding the user base.
Recruit, onboard, and manage a distributed human workforce dedicated to data annotation, curation, and quality review processes.
Develop and enhance quality assurance processes to guarantee that data outputs align with the standards required by leading AI laboratories.
Lead product operations for the data platform by collaborating with engineering teams to implement tooling improvements, track operational metrics, and identify inefficiencies.
Create comprehensive documentation, standard operating procedures, and training materials to support the operational framework.

Mar 11, 2026

Lead Data Engineer

Arine

Full-time|Remote|Remote (United States of America)

Located in San Francisco, Arine is an innovative and rapidly expanding healthcare technology and clinical services firm dedicated to delivering the safest and most effective treatments tailored to the evolving needs of individuals.

In the healthcare landscape, medications can sometimes do more harm than good. The mismanagement of drugs and dosages results in over $528 billion wasted annually in the U.S. healthcare system. Arine is transforming healthcare standards by addressing these challenges through our advanced software platform (SaaS). We leverage state-of-the-art data science, machine learning, AI, and deep clinical knowledge to create a patient-centered approach to medication management, facilitating the development and delivery of personalized care plans at scale for patients and their healthcare teams.

At Arine, we are passionate about enhancing the lives and health of complex patients who significantly influence healthcare costs and have been historically difficult to reach. These individuals face a myriad of challenges, including complex medication prescriptions across various providers, chronic disease medication management, and barriers to accessing care. Supported by top healthcare investors and partnerships with leading healthcare organizations, we provide actionable recommendations and initiate clinical interventions that yield substantial health improvements for patients and cost savings for our clients.

Why Choose Arine as Your Workplace?

Collaborative Team Culture - Our shared mission drives our collective efforts, inspiring us to excel in our work. We are passionately committed to the innovation necessary to lead in medication intelligence.
Impactful Contribution to Healthcare - We are saving lives and empowering individuals to achieve better health outcomes.

Jan 7, 2026

Senior Lead, Research & Evaluation

aiedu

Full-time|On-site|San Francisco, United States

Join aiedu as a Senior Lead in Research & Evaluation, where you will drive impactful research initiatives that shape educational practices and policies. In this role, you will lead a team of researchers in designing and executing comprehensive evaluations that inform our strategic direction. Your expertise will be critical in analyzing data, generating insights, and communicating findings to stakeholders.

Mar 13, 2026

Revenue Technology - Data Strategy & Operations Lead

Mercury

Full-time|$142.6K/yr - $198K/yr|Remote|San Francisco, CA, New York, NY, Portland, OR, or Remote within Canada or United States

At Mercury, we are revolutionizing the banking experience for ambitious companies. Our innovative financial platform is supported by a robust data system that users can trust.

As we continue to scale, our revenue systems produce a wealth of information: insights from both remote and in-person engagements, automation tools, product usage metrics, lifecycle events, and analytics pipelines. Transforming this activity into clear and actionable intelligence, without fragile pipelines or the need for constant rework, is essential to our growth strategy.

We are seeking a Data Strategy & Operations Lead to take charge of the data foundations that drive our revenue execution. This pivotal role will ensure that our revenue data remains reliable, interpretable, scalable, and actionable as our business evolves, empowering teams to make confident decisions based on data.

In this position, you will report directly to the Head of Platforms & Infrastructure and will play a key role in shaping how Mercury models, governs, and operationalizes go-to-market (GTM) data. You will collaborate closely with teams in Data Engineering, Data Science, Solution Architecture, and Platform Engineering.

Feb 17, 2026

Evaluation Engineer

Braintrust

Full-time|Remote|San Francisco

Join our dynamic team as an Evaluation Engineer at Braintrust, a leading talent network that empowers companies to harness the expertise of top talent. In this role, you will be responsible for developing and implementing evaluation frameworks to assess various projects and initiatives. You will work closely with cross-functional teams to ensure alignment with our strategic objectives and contribute to data-driven decision-making processes.

Mar 13, 2026

Data Center DCIM Program Lead - Infrastructure Operations

Cloudflare, Inc.

Full-time|Hybrid|Hybrid

Join Cloudflare as a Data Center DCIM Program Lead in our Infrastructure Operations team, where you will play a pivotal role in optimizing our data center operations. This position offers an exciting opportunity to lead the implementation of Data Center Infrastructure Management (DCIM) programs, ensuring our data centers operate at peak efficiency.

Feb 6, 2026

Lead of Data Center Hardware Operations - Stargate

OpenAI

Full-time|On-site|San Francisco

About Our Team

At OpenAI, we are on a mission to construct the most sophisticated AI infrastructure ecosystem in collaboration with our investment partners. Our team is pivotal in defining and executing our foundational infrastructure strategy, engaging in every phase from site selection to complete buildout. We operate at the crossroads of commercial interests, technical innovation, strategic planning, and operational execution, collaborating closely with both internal teams and external partners.

About the Position

We are looking for an experienced Data Center Hardware Operations Lead, with over 15 years of experience managing intricate, mission-critical data center environments, to drive our hardware and network operations as well as logistics. You will be responsible for overseeing and optimizing the physical operations of our hardware on a daily basis, which includes material movement and hardware maintenance across our growing global presence. This role involves designing scalable systems for repairs and logistics, managing vendor relations, and ensuring seamless coordination among facilities, supply chain, and engineering teams.

If you excel in fast-paced, complex environments and have a proven history of operational excellence at scale, we invite you to apply.

Key Responsibilities

Work alongside internal and external teams to establish a comprehensive hardware operations strategy, critical performance metrics, and service level agreements (SLAs).
Oversee daily physical operations of our data center campuses, from the commissioning phase through ongoing maintenance.
Design and implement effective logistics systems for material movement, repairs, and operational workflows.
Collaborate with engineering, construction, supply chain, and operations teams to streamline processes and eliminate bottlenecks.
Develop tools and methodologies that enhance traceability, throughput, and vendor coordination.
Manage relationships with external partners to ensure alignment with operational objectives.
Implement best practices in logistics and operational planning to support scalable infrastructure growth.

Qualifications

Minimum of 15 years of experience in physical operations and logistics within mission-critical infrastructure and data centers.
Demonstrated ability to manage complex logistics systems and coordinate across multiple disciplines.
In-depth knowledge of operational processes for large-scale facilities, including maintenance, construction support, and warehousing.
Experience in leading cross-functional initiatives and collaborating with third-party vendors.

Feb 12, 2026

Lead Engineer - Data and Integrations

onoshealth

Full-time|On-site|San Francisco

As a Lead Engineer specializing in Data and Integrations at onoshealth, you will play a pivotal role in driving innovation and improving healthcare solutions through effective data management and system integrations. You will lead a team of talented engineers, oversee complex projects, and work closely with stakeholders to ensure the delivery of high-quality solutions that meet our clients' needs.

Mar 26, 2026

AI Evaluation Engineer

distyl

Full-time|Remote|San Francisco

distyl seeks an AI Evaluation Engineer based in San Francisco. This position centers on assessing artificial intelligence systems, measuring how well models perform, and guiding the process for testing and refining products.

Role overview

The main focus is to evaluate AI models for accuracy and reliability. The role involves shaping and maintaining testing protocols for both new and existing systems. Collaboration is key, as you will work with teams across the company to help ensure that AI outputs consistently meet quality standards.

What you will do

Assess AI models to determine their accuracy and reliability
Create and update testing protocols for a range of systems
Partner with teams throughout the organization to uphold quality benchmarks for AI outputs

Requirements

Keen attention to detail
Interest in artificial intelligence and its real-world uses
Comfort working with colleagues from diverse backgrounds

Apr 23, 2026

Operations Lead, Forward Deployed Engineering

OpenAI

Full-time|Hybrid|San Francisco

About Our Team

The Forward Deployed Engineering (FDE) team at OpenAI collaborates with clients to transform cutting-edge research into robust, production-ready AI systems. Positioned at the nexus of Product, Engineering, Research, and Go-To-Market strategies, we integrate closely with users to tackle significant challenges and identify trends that influence our platform. Our mission is to deploy advanced capabilities in real-world scenarios and distill customer feedback into sustainable solutions, consistent patterns, and product strategies.

About the Position

We are seeking an Operations Lead to establish, manage, and enhance the systems that empower the FDE team to operate efficiently at scale. This pivotal role directly impacts our ability to implement state-of-the-art AI technologies in practical applications. You will leverage rapid insights from the field and business operations to formulate clear operational strategies, aligning project demands with team capacity, driving staffing decisions, and ensuring predictable scaling of our portfolio.

In this role, you will collaborate closely with Business, Product, and Go-To-Market stakeholders to enhance our prioritization, planning, and coordination processes. Instead of overseeing a single program, you will facilitate essential operational rhythms for the team, including portfolio reviews, execution tracking, and quarterly planning, ensuring that leaders have transparent visibility into risks and progress as the organization expands. This is a senior individual contributor role with extensive responsibilities across the FDE operational framework.

This position is available in San Francisco or New York City, utilizing a hybrid work model that includes three days in the office per week, with relocation assistance offered to new hires.

Mar 10, 2026

Growth Strategy Lead

Rox Data Corp

Full-time|On-site|San Francisco

Join Our Team at Rox

At Rox, we are pioneering the AI-native revenue operating system designed specifically for modern go-to-market teams. Supported by influential investors such as Sequoia, GV, and General Catalyst, we are collaborating with dynamic enterprise teams to transform fragmented CRM workflows into intelligent, autonomous systems. Our platform seamlessly connects data across the GTM stack, employs AI agents that execute real tasks, and provides revenue leaders with a clear, unified understanding of the elements that drive success.

As a nimble Series A startup, we are taking on one of the most established categories in software, and we are succeeding by merging deep technical expertise with an unwavering focus on practical value.

About the Growth Team

The Growth team at Rox is a critical function that operates at the intersection of Sales, Product, Marketing, and Data, dedicated to converting insights into action. This team is responsible for identifying areas of potential leverage, creating scalable growth systems, and executing strategies with speed and accuracy.

Currently, our Growth Strategy team is actively engaged with five talented individuals driving pipeline growth, experimentation, and go-to-market execution. This role is designed to lead that team, elevate performance standards, and expand both strategic initiatives and team capabilities.

About the Role

We are looking for a Growth Strategy Lead who can not only lead by example but also build a strong team around them.

In this role, you will take charge of a five-member Growth Strategy team, setting the strategic direction, mentoring team members, and ensuring that every initiative directly contributes to revenue growth. Collaborating closely with the Chief Growth Officer, you will act as a strategic partner in shaping priorities and scaling Rox's go-to-market efforts.

This is an active leadership position. You will be hands-on in designing growth strategies, validating concepts, and stepping in as needed, while also establishing the systems, processes, and team dynamics required to maintain sustainable growth.

If you thrive in a role that encompasses both strategic oversight and tactical execution, and you desire true ownership over your work, this position is tailored for you.

Key Responsibilities

Lead and oversee a team of five Growth Strategists, establishing clear objectives, priorities, and expectations.
Collaborate directly with the Chief Growth Officer to develop and implement Rox’s growth strategy.
Design, launch, and refine go-to-market initiatives focused on pipeline creation, activation, and expansion.
Translate product features into actionable growth strategies and replicable playbooks.
Work collaboratively across Sales, Product, Marketing, and Data to ensure execution alignment.

Feb 5, 2026

Facilities Operations Lead - Stargate

OpenAI

Full-time|On-site|San Francisco

About Our Team

At OpenAI, we are on an ambitious mission, collaborating closely with our capital partners to create the most advanced AI infrastructure ecosystem in the world. Our Infrastructure team plays a pivotal role in this endeavor, formulating core strategies and bringing our vision to life. This team operates at the nexus of commercial, technical, and operational fields, engaging with experts and executives both within and beyond OpenAI. We are dedicated to designing and managing mission-critical facilities that support high-performance AI workloads at scale.

Role Overview

We are looking for a dynamic Facilities Operations Lead to champion the commissioning, rollout, and sustained operation of our next-generation AI data centers. This position serves as a critical link between data center construction and hardware integration, ensuring a smooth transition of mission-critical infrastructure in alignment with hardware deployment schedules. You will be responsible for defining and executing commissioning plans, facilitating infrastructure setup, and overseeing operations and maintenance for our state-of-the-art, large-scale AI data centers.

In this role, you will work closely with design, construction, and hardware teams to establish repeatable processes for new data center builds and take the lead in hands-on operations to maintain the performance and reliability of our deployed infrastructure.

Key Responsibilities

Develop and implement sequences of operations, commissioning protocols, and bring-up processes for mission-critical data center facilities.
Collaborate with design and hardware teams to create tailored deployment procedures for various data center and hardware configurations.
Manage the installation, commissioning, and operational readiness of expansive data center campuses.
Oversee monitoring, maintenance, and quality assurance of data center infrastructure, including high-performance liquid cooling systems.
Formulate an on-site operations staffing strategy.
Establish and enforce protocols for planned and unplanned downtime and service level agreements for critical facility components.

Qualifications

A minimum of 10 years of experience in large-scale data center facility operations, commissioning, or critical infrastructure engineering.
Extensive knowledge of liquid-cooled IT systems, including CDU and in-rack/in-row manifold design, setup, and maintenance.
Strong background in managing complex operational tasks in high-demand environments.

Mar 27, 2026

Lead Researcher for Evaluations at Cartesia | San Francisco, CA

Cartesia

Full-time|On-site|HQ - San Francisco, CA

About Cartesia

At Cartesia, we are on a mission to revolutionize artificial intelligence by creating interactive, ubiquitous intelligence that operates seamlessly wherever you are. Current AI models struggle to continuously process and reason over extensive streams of data, including a year’s worth of audio, video, and text. Our innovative team is developing advanced model architectures to overcome these challenges.

Founded by PhDs from the Stanford AI Lab who pioneered State Space Models, we blend deep expertise in model innovation with a design-focused engineering approach. With backing from top-tier investors such as Index Ventures and Lightspeed Venture Partners, along with a network of industry-leading advisors, we are pushing the boundaries of AI.

About the Role

Join our New Horizons Evaluations team as the Evaluations Lead, where you will redefine how we measure progress in interactive machine intelligence. You will create evaluation frameworks that assess not only what models know but also how they reason, remember, and engage over time. This multifaceted role bridges research, product development, and infrastructure to establish metrics and systems that articulate the essence of “intelligence” in the next wave of AI. Ideal candidates will possess a blend of scientific rigor and technical prowess, alongside a genuine curiosity about user interactions with intelligent systems. Your contributions will be pivotal in shaping Cartesia’s model development, focusing on deeper qualities such as understanding, naturalness, and adaptability in real-world applications.

Your Impact

Define and identify essential model capabilities and behaviors for next-generation evaluations.
Develop and implement comprehensive evaluation pipelines with robust statistical analysis and transparent reporting.
Collaborate closely with model training and research teams to integrate evaluation systems into the model development process.
Design and prototype user studies and behavioral experiments to ground evaluations in practical use.

Oct 21, 2025

Machine Learning Evaluations Engineer

Exa

Full-time|On-site|San Francisco, California

At Exa, we are pioneering the next generation of search engines designed for the era of artificial intelligence, starting from the foundational Silicon architecture. Our ambitious indexing operation is unparalleled, allowing us to crawl the vast open web at an extraordinary scale. We harness cutting-edge embedding models to comprehend this data and utilize our high-performance Rust-based vector database alongside a $5M H200 GPU cluster, which powers tens of thousands of machines simultaneously.

The Machine Learning (ML) division is central to this mission, focusing on the training of foundational models that enhance search capabilities. Our vision is to create systems capable of swiftly filtering the world’s knowledge to deliver precisely what you need, regardless of the complexity of your inquiry, effectively transforming the web into a robust, searchable database.

To achieve this ambitious goal, we must define what constitutes “effective search”. This is where your expertise will play a crucial role.

We are seeking a talented Machine Learning Evaluations Engineer to develop and implement our evaluation framework at Exa. This position entails exploring methodologies to assess search engines in a world dominated by large language models (LLMs) and crafting the most thorough, innovative, and impactful evaluation suite. Your decisions will influence the future of search optimization and directly affect the research team’s focus, shaping the company’s strategic direction.

Oct 15, 2025

Strategic Project Lead – Growth & Operations

Datacurve

Full-time|On-site|San Francisco

Company Overview

Datacurve is revolutionizing data generation for AI model development with our innovative gamified platform, Shipd. We specialize in delivering high-quality coding data and have rapidly grown our annualized run rate to nearly eight figures within a year, achieving remarkable profit margins of 80-90%. Recently, we secured $15 million in Series A funding at a $150 million valuation, expanding our team from 5 to 10 members in just a couple of months. Our distinguished investors include Y Combinator, leaders from Google DeepMind, and notable executives from Anthropic, OpenAI, Vercel, Replit, Cohere, and Redis.

At Shipd, we transform data-creation tasks into lucrative opportunities by offering paid bounties for contributors. We manage both large-scale contracts and shorter projects, allowing contributors to engage in quests to earn rewards. We are actively looking for a Growth Specialist to enhance our contributor base and contractor network.

Why You Should Join Us

Take on a crucial role in developing the infrastructure that fuels the future of AI.
Drive strategy and execution in a vital growth area for the company.
Collaborate with experienced founders, operators, and technology experts.
Be part of a mission-driven team operating in a dynamic, impactful environment.

Your Role

You will craft and lead the strategy to expand our contributor community from 16,000 to over 100,000, focusing on attracting and engaging vetted expert contributors, particularly in coding. Additionally, you'll establish systems for outreach, tracking, management, and retention of high-performing contractors, ensuring we can meet the demands of technical and specialized annotation projects.

Your Responsibilities

Growth & Contributor Acquisition

Design and implement multi-channel campaigns to attract expert contributors in specialized fields.
Create innovative acquisition strategies, including coding challenges, referral initiatives, university partnerships, and community-driven content growth.
Engage with relevant communities through platforms such as Discord, Slack, newsletters, and more.

Jul 28, 2025

Research Engineer in Model Evaluations

Anthropic

Full-time|Remote|Remote-Friendly (Travel-Required) | San Francisco, CA | New York City, NY

Anthropic is looking for a Research Engineer focused on model evaluations. This position involves research and development to assess and strengthen the performance of AI models. Teams are based in San Francisco and New York City, and the role supports remote work with required travel.

Key responsibilities

Design and implement evaluations for Anthropic's AI models
Collaborate with team members to enhance model performance
Contribute to research that pushes the boundaries of AI systems

Location

Remote-friendly (travel required)
San Francisco, CA
New York City, NY

Apr 28, 2026

Operations Lead

Sazabi

Full-time|On-site|San Francisco

Join Our Mission

As we approach 2026, we're facing what we call the 'infinite software crisis.' At Sazabi, we are pioneering solutions to support, maintain, and operate the rapid expansion of application development.

Introducing Sazabi: the AI-native observability platform designed for agile engineering teams.

Sazabi empowers teams to inquire about their production systems in straightforward language, automatically visualize system activity, and identify root causes ten times faster. Say goodbye to cumbersome instrumentation, complicated dashboard setups, and tedious alert configurations. Just straightforward answers.

We are proud to be supported by industry leaders from renowned AI companies including Vercel, Graphite, Daytona, Browserbase, LangChain, Mastra, Replit, and others.

Your Role

Take ownership of complex, critical problems that span multiple functions.
Collaborate directly with founders to swiftly identify and resolve bottlenecks.
Establish and maintain the internal systems that keep Sazabi thriving (people operations, recruitment processes, financial operations, tooling, etc.).
Design and implement streamlined processes that facilitate scalability without hindering speed.
Engage in crucial projects across hiring, go-to-market strategies, product development, and customer support.
Enhance our operational efficiency and clarity as we expand.
Transform disorder into dynamic progress.

Who You Are

Exhibit a high degree of agency.
Possess exceptional problem-solving abilities in both technical and non-technical realms.
Thrive in ambiguous situations with ever-evolving priorities.
Demonstrate proactive capability in developing systems, processes, and workflows from the ground up.
Be highly organized while steering clear of bureaucracy.
Show strong communication skills and sound judgment.
Be ready to tackle any task that the company requires, regardless of how unglamorous it may be.

If you're enthusiastic about Twitter and agents, you'll fit right in!

What We Provide

Attractive salary and equity options.
Complimentary lunches (in-office only).
Comprehensive health, dental, and vision insurance.
Unlimited paid time off.
Paid parental leave.

Mar 23, 2026

Staff Machine Learning Research Scientist - LLM Evaluations

Scale AI

Full-time|$280K/yr - $380K/yr|On-site|San Francisco, CA; Seattle, WA; New York, NY

At Scale AI, we are the premier partner for data and evaluation in the rapidly evolving field of artificial intelligence. Our commitment to advancing the assessment and benchmarking of large language models (LLMs) positions us at the forefront of AI innovation. We are dedicated to creating leading-edge LLM evaluation methodologies that set new benchmarks for model performance. Our research teams collaborate with the top AI laboratories in the industry to provide high-quality data, accelerate progress in generative AI research, and inform what excellence looks like in this domain. As a Staff Machine Learning Research Scientist on our LLM Evals team, you will spearhead the creation of novel evaluation methodologies, metrics, and benchmarks to assess the strengths and weaknesses of cutting-edge LLMs. Your work will shape our internal strategies and influence the broader AI research community, making this role essential for establishing best practices in data-driven AI development.

Mar 26, 2026

Team Lead Backend Engineer for Data and API Solutions

HockeyStack

Full-time|$200K/yr - $240K/yr|Remote|San Francisco

Join HockeyStack, where we are revolutionizing B2B sales and marketing through advanced AI systems.

We seamlessly integrate with a multitude of tools across the go-to-market stack, merging first and third-party data to create the industry's most comprehensive event-level data foundation. Our capabilities include:

Analyzing every deal won and lost to uncover revenue-driving patterns, enabling real-time guidance for sales representatives.
Identifying key factors influencing the sales pipeline, providing immediate insights without the need for extensive manual analysis.
Learning from every outcome and interaction, continuously improving our system's performance and accuracy.

Unlike others, our structured reasoning layer ensures deterministic and self-improving agent execution. As a Y Combinator alum with a $26M Series A led by Bessemer Venture Partners, we are experiencing rapid growth with an 8-figure ARR and over 60 TB of GTM data processed. We are looking for dynamic individuals ready to succeed. This position can be based in our San Francisco HQ or be fully remote.

Your Mission

As the Backend Tech Lead for the Data API team, you will design and scale our core API infrastructure, which is critical for Marketing, Sales, and Customer Success operations. You will oversee the engineering of services that manage over 60 TB of GTM data, ensuring the delivery of low-latency, high-availability APIs that provide valuable buyer insights to our internal applications and customer-facing dashboards.

This is an exceptional opportunity for an engineer eager to take ownership of large-scale data and API infrastructure while fostering engineering excellence and rapid deployment in a fast-paced environment.

What You’ll Do

Lead the design and development of APIs to share GTM data with both internal and external users.
Oversee the architecture for large-scale data access patterns, focusing on performance, cost-effectiveness, and reliability.
Work closely with AI, data, and frontend teams to ensure smooth integrations and clear API contracts.
Address complex challenges including versioning, rate-limiting, authentication, query complexity, and cache invalidation in a rapidly evolving product landscape.

Feb 24, 2026
