Engineering Manager - AI Cloud Platform

Lambda, San Francisco Office

On-site Full-time

Experience Level

Manager

Qualifications

What You Will Do
- Lead the AI Cloud Core Platform team of approximately six engineers, overseeing all aspects of Cloud Platform and governance capabilities.
- Drive the execution of roadmap features, including cluster lifecycle automation.
- Collaborate closely with Product and Design teams to ensure the user experience aligns with the needs of enterprise customers.
- Balance rapid feature delivery with strategic investments in scalability, observability, and platform design.
- Recruit, mentor, and cultivate a team of engineers, providing guidance and career development.
- Work in tandem with other Lambda teams (Control Plane, Billing, Platform) to guarantee seamless and integrated delivery across the stack.
- Foster a culture of high performance, documentation, humility, and curiosity.
- Maintain a product-focused approach in leadership and execution, prioritizing customer needs with an emphasis on feature velocity, reliability, and security.
- Shape a culture of sustainable, empathetic, and high-velocity engineering, emphasizing cross-team collaboration, documentation, and data-driven decision-making.

About the job

Join us in our quest to build the world’s leading AI cloud platform.

Note: This role mandates in-office presence in our San Francisco location four days a week; Lambda’s designated remote work day is Tuesday.

As an Engineering Manager at Lambda, you will lead the charge in developing and scaling our cloud offerings, which encompass the Lambda website, cloud APIs, and internal tools for deployment, management, and maintenance.

About Lambda

At Lambda, we strive to revolutionize the AI cloud landscape by providing cutting-edge solutions that empower users and organizations alike. Our innovative approach and commitment to excellence are what set us apart in the rapidly evolving technology sector.

Similar jobs

Engineering Manager, AI Observability & Evaluations Platform

LangChain

Full-time|$200K/yr - $250K/yr|On-site|San Francisco, CA

About Us: At LangChain, we are dedicated to making intelligent agents a standard part of everyday life. Our goal is to provide the essential framework for agent engineering, empowering developers to transition their ideas from prototypes to production-ready AI agents that teams can trust. Initially launched as a widely embraced open-source initiative, our evolution has led us to offer a robust platform tailored for building, evaluating, deploying, and managing agents at scale.
Our platforms, including LangChain, LangGraph, LangSmith, and Agent Builder, are now instrumental for teams delivering innovative AI solutions across diverse sectors, from startups to major corporations. Industry leaders such as Replit, Clay, Coinbase, Workday, Lyft, Cloudflare, Harvey, Rippling, and Vanta, along with 35% of the Fortune 500, rely on LangChain for their AI initiatives.
Having successfully secured $125M in Series B funding from prominent investors like IVP, Sequoia, Benchmark, CapitalG, and Sapphire Ventures, we are poised for continued growth and innovation. At LangChain, every team member plays a vital role in shaping our projects and collaborative work environment, making it a place where your input can significantly influence the future of technology.
About The Role: We are seeking a dynamic Engineering Manager to spearhead the development of LangSmith, our observability and evaluation platform designed for LLM applications. In this role, you will set the technical vision, cultivate and mentor a high-performing engineering team, and collaborate closely with product and design teams to deliver features that enable developers to construct and deploy reliable AI systems with assurance.
You will:
- Build, mentor, and expand a talented team of engineers, fostering a culture of collaboration, ownership, and accountability.
- Enhance LangChain’s engineering culture through mentorship, commitment to high-quality code, and technical excellence.
- Define long-term technical strategy and guarantee the scalability and reliability of the LangSmith AI Observability Platform.
- Work alongside product and design teams to outline project scope, sequence, and success metrics for key initiatives.
- Uphold a high standard of technical excellence while ensuring the team remains focused and operates with urgency.
- Lead by example in producing clean, maintainable, and thoroughly tested code using Go/Python and TypeScript.
- Engage directly with customers to grasp their needs and translate those insights into actionable product enhancements.

Feb 6, 2026

FullStack Engineer - Observability & Evals Platform at LangSmith | San Francisco

LangChain

Full-time|$125K/yr - $145K/yr|On-site|San Francisco, CA

About Us: At LangChain, we are dedicated to making intelligent agents a fundamental part of everyday technology. Our mission is to provide the essential tools for agent engineering in practical applications, enabling developers to transition seamlessly from initial prototypes to production-ready AI agents that organizations can depend on. Starting as a suite of widely adopted open-source tools, we have expanded to offer a comprehensive platform for building, evaluating, deploying, and managing AI agents at scale.
Currently, our platforms, including LangChain, LangGraph, LangSmith, and Agent Builder, are trusted by teams developing real AI solutions in both startups and established enterprises. Our technology powers AI initiatives for renowned companies such as Replit, Clay, Coinbase, Workday, Lyft, Cloudflare, Harvey, Rippling, Vanta, and 35% of the Fortune 500.
With $125M raised in Series B funding from IVP, Sequoia, Benchmark, CapitalG, and Sapphire Ventures, we are at an exciting juncture where we continue to innovate, grow rapidly, and every team member can make a significant impact on our products and collaboration. Join us at LangChain, where your contributions can reshape the technology landscape.
About the Role: In-person, 5 days a week in San Francisco.
We are seeking a Fullstack Engineer to join our LangSmith product team, focusing on our commercial AI observability and evaluation platform. In this position, you will have the opportunity to develop new features and capabilities for our platform while collaborating closely with enterprise clients, developer end-users, and internal stakeholders.
Your Responsibilities:
- Design and implement critical product features utilizing our Go, Python, and TypeScript stack
- Work in close partnership with product and design teams to refine features and enhance the product roadmap
- Drive project timelines effectively while maintaining high engineering standards through clean, maintainable, and well-tested code
To Succeed in This Role:
- 2+ years of experience in software engineering, particularly with complex platform products
- Fullstack engineering experience with Go or Python on the backend and React + TypeScript on the frontend
- Strong understanding of database systems, especially Postgres and Redis
- Experience in designing and scaling APIs, ideally in high-performance environments

Aug 15, 2025

Senior Frontend Engineer for AI Observability & Evals Platform

LangChain

Full-time|$175K/yr - $225K/yr|On-site|San Francisco, CA

About Us: At LangChain, we are dedicated to making intelligent agents a common part of everyday technology. Our goal is to provide a robust foundation for agent engineering that empowers developers to transition from prototypes to production-ready AI agents that teams can depend on. Initially starting as a widely embraced open-source toolset, we have expanded our offerings to include a comprehensive platform for the building, evaluating, deploying, and managing of agents at scale.
Currently, our tools (LangChain, LangGraph, LangSmith, and Agent Builder) are utilized by teams developing real AI products in both startups and large enterprises. Millions of developers rely on LangChain to power AI initiatives at notable companies such as Replit, Clay, Coinbase, Workday, Lyft, Cloudflare, Harvey, Rippling, Vanta, and 35% of the Fortune 500.
Having secured $125M in Series B funding from leading investors like IVP, Sequoia, Benchmark, CapitalG, and Sapphire Ventures, we are in an exciting phase of product development and rapid growth, where every team member has a substantial impact on our projects and collaborative efforts. At LangChain, your contributions will play a crucial role in shaping how this technology manifests in the real world.
About the Role: This position requires in-person attendance 5 days a week in San Francisco, CA, as well as options in New York and Boston.
We are seeking a seasoned frontend engineer to innovate and improve features on LangSmith, our enterprise platform designed for LLM application observability, testing, and debugging.
What You Will Do:
- Create new user-facing features utilizing React and TypeScript.
- Develop reusable components and front-end libraries for future projects.
- Convert designs and wireframes into high-quality, maintainable code.
- Optimize components for peak performance across diverse web-capable devices and browsers.
- Collaborate with fullstack and backend developers as well as UX/UI designers to enhance usability and experience.
You’re a Good Fit If You Have:
- Extensive frontend engineering experience, with strong command of React, JavaScript, and TypeScript.
- Practical experience with frontend development tools such as Babel, Vite, Webpack, NPM, and Yarn.
- Familiarity with REST APIs and experience collaborating closely with fullstack and backend developers.

Jun 9, 2025

Senior Fullstack Engineer for AI Observability & Evals Platform

LangChain

Full-time|$175K/yr - $225K/yr|On-site|San Francisco, CA

About Us: LangChain is dedicated to making intelligent agents commonplace. We are pioneering the foundations of agent engineering in the real world, empowering developers to transition from prototypes to production-ready AI agents that teams can depend on. Initially known for our widely embraced open-source tools, we have expanded to provide a comprehensive platform for constructing, assessing, deploying, and managing agents at scale.
Our products, including LangChain, LangGraph, LangSmith, and Agent Builder, are utilized by teams delivering genuine AI solutions in both startup environments and large corporations. Millions of developers trust our technology to elevate AI initiatives at organizations such as Replit, Clay, Coinbase, Workday, Lyft, Cloudflare, Harvey, Rippling, Vanta, and 35% of the Fortune 500.
With $125M raised in our Series B funding from IVP, Sequoia, Benchmark, CapitalG, and Sapphire Ventures, we are poised for continued product development and accelerating growth, where each team member plays a significant role in shaping our technology and collaborative culture.
About the Role: On-site 5 days a week in San Francisco.
We are seeking a Senior Fullstack Engineer for our commercial product, LangSmith, which serves as an observability and evaluation platform. In this role, you will have the chance to influence the technical direction of our platform while engaging with enterprise clients, developer end-users, and internal stakeholders.
- Lead the technical architecture and implementation of essential product features for LangSmith, utilizing our entire stack of Go, Python, and TypeScript.
- Work closely with product and design teams to iterate and refine new features.
- Mentor and support junior team members, driving ambitious project timelines while upholding high engineering standards.
- Set an example by producing clean, maintainable, and thoroughly tested code.

Feb 19, 2025

AI Observability Research Engineer

Anthropic

Full-time|$320K/yr - $405K/yr|On-site|San Francisco, CA

About Anthropic
At Anthropic, we are dedicated to developing AI systems that are reliable, interpretable, and controllable. Our mission is to ensure that artificial intelligence remains safe and beneficial for individuals and society at large. Our rapidly expanding team comprises passionate researchers, engineers, policy experts, and business leaders collaborating to create positive AI solutions.
About the Team
As the scale of AI training and deployment increases, so does the volume of data that requires monitoring and comprehension. Our team utilizes Claude to interpret this data effectively. We manage an integrated suite of tools that empowers Anthropic to pose open-ended inquiries, identify unexpected patterns, and maintain significant human oversight over extensive datasets.
Our tools are widely utilized internally, driving ongoing enforcement, threat intelligence investigations, model audits, and much more. We are seeking skilled engineers and researchers to enhance existing applications and innovate new ones from the ground up.
About the Role
As a Research Engineer on our team, you will design and develop systems that enable AI to analyze vast, unstructured datasets (think tens or hundreds of thousands of conversations or documents) and generate structured, reliable insights. You will engage with the entire technology stack, from foundational analysis frameworks to user-facing applications and interfaces.
This is a high-impact position. The tools you create will be utilized by numerous researchers and investigators, directly influencing our capacity to assess and counteract both misuse and misalignment.

Feb 20, 2026

Senior Software Engineer - Cloud Availability Platform Engineering (Observability)

Crusoe

Full-time|$166K/yr - $201K/yr|On-site|San Francisco, CA - US

At Crusoe, we are on a mission to accelerate the availability of energy and intelligence. We are building the foundational technology that empowers individuals to innovate boldly with AI while maintaining speed, scale, and sustainability.
Join us in the AI revolution with sustainable technology at Crusoe, where you will lead significant innovations, make a real impact, and collaborate with a team that is pioneering responsible and transformative cloud infrastructure.
About the Role: We are seeking a highly proficient engineer with extensive experience in designing and managing observability platforms at scale. You will be responsible for architecting, developing, and operating Crusoe’s next-generation observability stack, which will allow engineers to gain insights into the internal state of distributed systems through metrics, logs, and traces. Your contributions will guarantee reliability, performance, and actionable insights across Crusoe’s global infrastructure and cloud platform.
Key Responsibilities:
- Design and manage scalable observability systems (metrics, logging, tracing) in multi-datacenter Kubernetes environments.
- Architect comprehensive telemetry pipelines, covering ingestion, storage, querying, and visualization.
- Enhance monitoring and alerting mechanisms with Prometheus, Alertmanager, Thanos/Cortex, Grafana, and OpenTelemetry.
- Develop scalable log collection and processing pipelines utilizing Fluent Bit, Vector, Loki, or ELK/Opensearch stacks.
- Implement distributed tracing platforms (Tempo, Jaeger, OpenTelemetry) and integrate with service meshes, load balancers, and APIs.
- Establish and promote the adoption of SLOs, SLIs, and error budgets across various services and teams.
- Automate the provisioning and scaling of observability infrastructure using Kubernetes, Terraform, and custom tools (Go, Python).
- Ensure the reliability and cost-effectiveness of telemetry pipelines while supporting high-volume workloads (AI/ML, HPC clusters, GPU infrastructure).
- Integrate security best practices into observability platforms, including RBAC, TLS, secret management, and multi-tenant access controls.
- Collaborate with engineering teams to embed observability into applications, services, and infrastructure.
- Mentor engineers and influence Crusoe’s observability strategy and technical roadmap.

Oct 1, 2025

AI Evaluation Engineer

distyl

Full-time|Remote|San Francisco

distyl seeks an AI Evaluation Engineer based in San Francisco. This position centers on assessing artificial intelligence systems, measuring how well models perform, and guiding the process for testing and refining products.
Role overview
The main focus is to evaluate AI models for accuracy and reliability. The role involves shaping and maintaining testing protocols for both new and existing systems. Collaboration is key, as you will work with teams across the company to help ensure that AI outputs consistently meet quality standards.
What you will do
- Assess AI models to determine their accuracy and reliability
- Create and update testing protocols for a range of systems
- Partner with teams throughout the organization to uphold quality benchmarks for AI outputs
Requirements
- Keen attention to detail
- Interest in artificial intelligence and its real-world uses
- Comfort working with colleagues from diverse backgrounds

Apr 23, 2026

Software Engineer, Observability

OpenAI

Full-time|On-site|San Francisco

Become part of the innovative engineering teams at OpenAI, where we create and deliver groundbreaking AI technologies responsibly and safely to the world!
Our Applied Engineering team collaborates across research, engineering, product, and design disciplines to deploy OpenAI's cutting-edge technology for both consumers and businesses. We are committed to learning from our deployments and ensuring that AI is utilized ethically while maximizing its benefits. To us, safety takes precedence over unchecked growth.
About the Role
We are in the process of developing OpenAI's observability product, which encompasses everything from scalable infrastructure to an intuitive, AI-enhanced user interface. Our systems process petabytes of logs and billions of time series metrics throughout our infrastructure. We are now integrating intelligence to create features like agents that summarize service events, auto-generate dashboards, and assist engineers in debugging through user-friendly notebook-like interfaces.
We are looking to hire software engineers at all levels of our stack, be it infrastructure, backend, or product. You will be part of a dynamic, resourceful team that develops both foundational infrastructure and innovative internal tools, ensuring the reliability, performance, and observability of OpenAI's production systems.
What You’ll Do
- Lead the development of core observability infrastructure, focusing on distributed logging, time series, and trace storage.
- Create AI-integrated tools that empower engineers to autonomously identify, comprehend, and resolve issues.
- Enhance user interface experiences including dashboards, notebooking, and interactive debugging.
- Work collaboratively with engineers, researchers, user operations, and various teams to craft the next generation of the observability product.
You Might Be a Fit If You:
- Have experience operating large-scale distributed systems in production, particularly logging systems or time series databases.
- Excel in ambiguous environments and tackle unscoped challenges head-on.
- Possess full-stack development skills or a strong product sensibility; you are eager to build practical tools that users will engage with.
- Demonstrate robust knowledge of systems, networking, and cloud infrastructure (Kubernetes, AWS, etc.).
- Bonus: Have built or contributed to observability systems (e.g., Prometheus, OpenTelemetry, etc.).
Why This Team?
- We combine infrastructure and product development to create real AI applications for in-house use.
- Your contributions will directly enhance the reliability of GPT-based products at OpenAI.

Feb 19, 2026

Software Engineering Manager - Observability

Figma, Inc.

Full-time|On-site|San Francisco, CA • New York, NY • United States

Join Figma as a Software Engineering Manager specializing in Observability. In this pivotal role, you will lead a dynamic team of engineers in developing cutting-edge solutions that enhance visibility and performance across our platform. Your expertise will drive the design and implementation of observability tools that empower our engineering teams to optimize their workflows, ensuring the robustness and reliability of our applications.

Feb 27, 2026

Staff AI Engineer - AI Platform

ClickUp

Full-time|On-site|United States of America

At ClickUp, we are not just creating software, we are crafting the future of work! In a landscape saturated with work complexities, we envisioned a better solution. This vision led to the development of the first genuinely integrated AI workspace, seamlessly combining tasks, documents, chat, calendar, and enterprise search, all enhanced by context-sensitive AI. This empowers millions of teams to break down barriers, reclaim their time, and elevate productivity to new heights. Here at ClickUp, you will have the chance to learn, leverage, and innovate with AI in ways that will not only transform our product but also redefine the future of work itself. Join us in being part of a daring, innovative team that is reshaping possibilities!
Role Overview: We are in search of a highly talented and experienced Staff AI Engineer – AI Platform to become a vital member of our ClickUp Engineering team. In this pivotal role, you will significantly contribute to the development of our core AI platform and directly utilize large language models (LLMs) to implement intelligent features across ClickUp. Your focus will be on back-end systems that facilitate scalable, dependable, and secure AI-driven capabilities, while also engaging hands-on with LLMs to address real user challenges and propel product innovation.
Key Responsibilities:
- Design, architect, and implement scalable AI platform services to support the deployment, orchestration, and lifecycle management of LLMs and other AI models.
- Utilize LLMs and other AI technologies to build and enhance ClickUp’s intelligent features, collaborating closely with product and engineering teams to deliver impactful solutions.
- Develop and maintain robust APIs and backend systems that facilitate seamless integration of AI-enhanced features into ClickUp’s core platform.
- Create infrastructure for model serving, monitoring, logging, and automated evaluation to ensure high reliability and performance of AI services in production.
- Integrate with various LLM providers (e.g., OpenAI, Anthropic, Google) and manage model selection, routing, and fallback strategies for optimal performance and cost-efficiency.
- Promote best practices in AI privacy, security, and compliance, including data anonymization and secure data handling.
- Optimize platform performance, scalability, and cost-efficiency, utilizing cloud-native technologies and distributed systems.
- Stay abreast of advancements in AI infrastructure, MLOps, and LLM applications, and proactively apply relevant innovations to ClickUp’s AI platform.
- Collaborate cross-functionally with teams to drive AI initiatives.

Jan 20, 2026

Staff+ Software Engineer, Observability

Anthropic

Full-time|On-site|San Francisco, CA | New York City, NY | Seattle, WA

Join Anthropic as a Staff+ Software Engineer specializing in Observability, where you will play a crucial role in enhancing our systems to ensure high-performance and reliability. Collaborate with cross-functional teams to develop innovative solutions, implement observability metrics, and drive improvements that enable better decision-making and user experiences.

Mar 12, 2026

Engineering Manager - AI Cloud Platform

Lambda

Full-time|On-site|San Francisco Office

Lambda, recognized as The Superintelligence Cloud, is a pioneering force in AI cloud infrastructure, empowering tens of thousands of customers, from AI researchers to large enterprises and hyperscalers. Our mission is to make computational power as accessible as electricity, providing everyone the capability of superintelligence: one person, one GPU.
Join us in our quest to build the world’s leading AI cloud platform.
Note: This role mandates in-office presence in our San Francisco location four days a week; Lambda’s designated remote work day is Tuesday.
As an Engineering Manager at Lambda, you will lead the charge in developing and scaling our cloud offerings, which encompass the Lambda website, cloud APIs, and internal tools for deployment, management, and maintenance.

Sep 10, 2025

Engineering Manager - Developer Observability

Adyen

Full-time|On-site|San Francisco

Join Adyen as an Engineering Manager for our Developer Observability team! In this pivotal role, you will lead a dynamic group of engineers dedicated to enhancing the observability of our developer platforms. You will be responsible for driving technical innovation, mentoring your team, and collaborating closely with cross-functional partners to deliver exceptional developer experiences.
As a leader, you will empower your team to excel in building tools and solutions that provide insights into system performance, ensuring our developers have everything they need to thrive. If you are passionate about technology, leadership, and fostering a culture of excellence, we want to hear from you!

Mar 27, 2026

Machine Learning Engineer - LLM Evaluations and Observability

gleanwork

Full-time|Remote|San Francisco Bay Area

Join gleanwork as a Machine Learning Engineer specializing in LLM evaluations and observability. In this role, you will be instrumental in developing cutting-edge machine learning systems that improve our understanding of large language models and their effectiveness. You will collaborate with cross-functional teams to drive the integration of advanced analytics and machine learning solutions.

Mar 16, 2026

Software Engineer, Generative AI Platform

Abridge

Full-time|On-site|SF Office

About Abridge
Founded in 2018, Abridge is dedicated to enhancing understanding in healthcare through our innovative AI-powered platform. We specialize in transforming medical conversations into structured clinical notes in real-time, enabling clinicians to prioritize patient care. Our enterprise-grade technology seamlessly integrates with electronic medical records (EMRs) to ensure accuracy and trust in AI-generated summaries.
As pioneers in generative AI for healthcare, we are setting the industry benchmarks for responsible AI deployment across health systems. Our diverse team consists of practicing MDs, AI scientists, PhDs, creatives, technologists, and engineers united in their mission to empower patients and make healthcare more comprehensible. We have offices located in San Francisco's Mission District, New York's SoHo neighborhood, and East Liberty in Pittsburgh.
The Role
Join us as an AI Platform Engineer, where your work will significantly impact the healthcare sector. You will collaborate with a multidisciplinary team of researchers, clinical scientists, and product engineers to design and develop the runtime, orchestration engine, and evaluation platform necessary for agentic orchestration and LLM-driven workflows.
What You’ll Do
- Create GenAI systems that transform LLMs into composable, reliable tools, utilizing retrieval, tool use, agentic reasoning, and structured outputs.
- Develop a highly reliable and scalable agent runtime that includes orchestration, shared state and memory, tool-calling interfaces, and scheduling focused on cost, latency, and quality.
- Build secure, sandboxed environments for agent actions and code, optimizing for cold start, isolation, and observability.
- Deliver unified interfaces for multiple model sizes and providers; integrate with open tool ecosystems such as MCP-style connectors.
- Create an evaluation platform for both online and offline assessments, A/B testing, safety checks, and regression gates that enhance agent reliability over time.
- Collaborate with Research to bring new agent capabilities from prototype to production.
What You’ll Bring
- Demonstrated experience in building agent applications with tool-calling, context engineering, and related technologies.
- Strong problem-solving skills and the ability to work in a fast-paced, collaborative environment.
- Familiarity with generative AI technologies and their applications in healthcare.

Oct 7, 2025

Platform Engineer - Generative AI

Uncountable

Full-time|$120K/yr - $160K/yr|On-site|New York, San Francisco, Munich or London

We appreciate your interest in joining the Uncountable Engineering team!
About the Role
Uncountable is on the lookout for passionate software engineers who specialize in deploying Generative AI within software applications.
Our software platform is utilized by scientists at top R&D organizations to efficiently structure and analyze experimental data. The researchers leveraging our platform are tackling significant challenges in materials science, chemistry, biotechnology, and other scientific domains. Our mission is to empower them to overcome these challenges with greater speed and efficiency.
In this role, you will harness the latest advancements in large language models (LLMs) to address the everyday obstacles faced by scientists. You will be responsible for developing AI-enhanced search and visualization tools, innovative user experiences utilizing multimodal LLMs, intelligent research assistants, and much more. This is a rare chance to be at the forefront of accelerating scientific innovation through LLMs and generative AI.
Your Responsibilities:
- Design and implement LLM-powered features from start to finish, encompassing both LLM-specific tasks (prompting, latency optimization, behavior tuning/testing) and conventional full-stack development (frontend/backend coding, API design, and database architecture).
- Create a robust in-house LLM architecture to meet the evolving demands of our product (including API integration, testing, observability, caching, and infrastructure for fine-tuning and inference).
At Uncountable, we pride ourselves on a culture of continuous deployment and rapid iteration. You will have the opportunity to release your work frequently to our users and refine it based on their feedback.
The features you develop will often represent groundbreaking approaches to software applications, allowing you to take on diverse roles as a designer, researcher, engineer, product manager, and more to craft outstanding products. This position is ideal for individuals eager to quickly enhance their skills and perspectives or those aspiring to start their own ventures in the future.
Salary Range: $120K-$160K + Equity

Apr 21, 2025

Platform Engineer - Generative AI

Uncountable

Full-time|$120K/yr - $160K/yr|On-site|New York, San Francisco, Munich or London

Thank you for your interest in Uncountable Engineering!
Position Overview
Uncountable is on the lookout for passionate software engineers who specialize in the deployment of Generative AI solutions.
Our innovative software platform empowers scientists at leading R&D organizations to efficiently structure and analyze experimental data. Users from various disciplines, including materials science, chemistry, and biotechnology, are leveraging our platform to tackle complex problems, and our mission is to expedite their research endeavors.
In this role, you will harness the latest advancements in Large Language Models (LLMs) to address real-world challenges faced by scientists daily. Your responsibilities will include developing AI-driven search and visualization tools, next-gen user experiences utilizing multimodal LLMs, intelligent research assistants, and much more. This is a unique chance to contribute to a transformative movement aimed at accelerating scientific discovery through generative AI.
Key Responsibilities
You will be tasked with:
- Crafting end-to-end LLM-powered features, encompassing LLM-specific tasks (prompting, latency optimization, and behavior tuning/testing) alongside traditional full-stack development responsibilities (frontend/backend coding, API and database design).
- Building a robust in-house LLM infrastructure to meet the evolving demands of our products (API integration, testing, observability, caching, infrastructure for fine-tuning and inference, etc.).
At Uncountable, we cultivate a strong culture of continuous deployment and rapid iteration cycles. You will frequently ship your work to users and refine it based on their feedback.
Many of the features you create will pioneer new and innovative software applications, allowing you to take on varied roles as a designer, researcher, engineer, product manager, and more. This position is ideal for individuals eager to quickly enhance their skills and perspectives or those aspiring to launch their own ventures in the future.
Salary Range: $120K-$160K + Equity

Apr 21, 2025

Principal/Staff Engineer, Platform & Systems

Retell AI

Full-time|On-site|San Francisco Bay Area

ABOUT RETELL AI

At Retell AI, we are revolutionizing the call center experience using cutting-edge voice AI technology. In just 18 months since our inception, thousands of companies have leveraged our AI voice agents to streamline sales, support, and logistics operations that previously required extensive human teams. Supported by prominent investors such as Y Combinator and Alt Capital, we have grown from $5M to an impressive $36M ARR with a dedicated team of 20.

Our ambitious vision for 2026 is to create a state-of-the-art customer experience platform where entire contact centers are driven by AI. Rather than relying on basic automation that necessitates constant human oversight, we are developing intelligent AI “workers” to function as frontline agents, QA analysts, and managers, constantly executing, monitoring, and enhancing customer interactions.

As we rapidly expand, we seek passionate innovators eager to solve complex technical challenges, move swiftly, and make a meaningful impact at one of the fastest-growing voice AI startups. Join us in building the future!

- Ranked among the top 50 AI applications in the a16z list: https://tinyurl.com/5853dt2x
- Ranked #4 on Brex's Fast-Growing Software Vendors of 2025: https://www.brex.com/journal/brex-benchmark-december-2025
- One of the top startups on the Lean AI leaderboard: https://leanaileaderboard.com/

THE ROLE

We are in search of a Principal/Staff Engineer to spearhead the technical direction of our core platform. This is an individual contributor role designed for someone who excels in uncertainty, acts swiftly, and elevates the standards of those around them.

You will engage with various systems, infrastructure, and product surfaces while collaborating closely with engineering teams, product managers, and leadership to scale successful initiatives and innovate for the future.

This role is not about merely addressing tickets; you will identify challenges, engineer solutions, and deliver impactful results.

KEY RESPONSIBILITIES

- Lead the design and evolution of our core platform and systems architecture.
- Oversee complex technical projects from inception to production.
- Make strategic technical decisions that optimize for speed, reliability, and scalability.
- Collaborate across teams to facilitate knowledge sharing and best practices.

Feb 8, 2026

Manager of Data & AI Platform Engineering

Stitch Fix, Inc.

Full-time|$146.3K/yr - $195K/yr|Remote|Remote, USA

About Stitch Fix, Inc.

Stitch Fix (NASDAQ: SFIX) stands as the premier online personal styling service, dedicated to helping individuals uncover styles that they will adore and that flatter their unique figures. We understand that few experiences are as personal as getting dressed, yet finding clothing that fits well and looks great can be daunting. Stitch Fix addresses this challenge by merging the expertise of skilled stylists with cutting-edge AI and recommendation algorithms. Our unique blend of exclusive and nationally recognized brands caters to each client's distinct preferences and requirements, allowing them to express their personal style effortlessly without spending hours in stores or browsing through countless online options. Founded in 2011 and headquartered in San Francisco, we are transforming the retail landscape.

About the Role

We are seeking a dynamic Manager of Data & AI Platform Engineering to spearhead our team of engineers focused on core data, machine learning, and generative AI platforms. You will play a crucial role in realizing our vision and driving the technical implementation of systems that facilitate AI-driven, data-centric experiences throughout the organization. This will empower richer personalization, enhanced decision-making, intelligent automation, and innovation across the enterprise.

You will help refine our technical infrastructure to support next-generation AI applications, including unified signals, adaptive and context-aware models, semantic understanding, retrieval-based intelligence, and advanced machine learning workflows.

Jan 27, 2026

Senior Observability Engineer

DigitalOcean

Full-time|Remote|San Francisco

Join DigitalOcean as a Senior Observability Engineer, where you will play a critical role in enhancing our monitoring and observability platforms. Your expertise will help us ensure that our systems are performant, reliable, and scalable, providing a seamless experience for our customers.

Mar 10, 2026
