Senior Engineer, Inference Data Plane at DigitalOcean: jobs in San Francisco – Browse 12,164 openings on RoboApply Jobs

Open roles matching “Senior Engineer, Inference Data Plane at DigitalOcean” located in San Francisco. 12,164 active listings on RoboApply Jobs.

DigitalOcean, Inc.
Full-time|Remote|San Francisco

We are seeking a highly skilled Senior Engineer to join our Inference Data Plane team at DigitalOcean. In this pivotal role, you will be responsible for designing and implementing advanced data processing solutions that facilitate machine learning inference at scale. You will work collaboratively with cross-functional teams to optimize our data infrastructure and ensure reliable performance.

Mar 24, 2026
DigitalOcean
Full-time|On-site|San Francisco

Join DigitalOcean as a Senior Engineer 2 specializing in Inference Data Plane, where you'll play a pivotal role in enhancing our data processing capabilities. You will work alongside a talented team to design, implement, and optimize systems that facilitate efficient inference operations. Your expertise will contribute to building scalable solutions that empower developers and organizations to harness the cloud effectively.

Mar 17, 2026
Cartesia
Full-time|On-site|*HQ - San Francisco, CA

Join Cartesia as an Inference Engineer
At Cartesia, our vision is to create the next evolution of AI: an interactive, omnipresent intelligence that operates seamlessly across all environments. Currently, even the most advanced models struggle to continuously analyze a year's worth of audio, video, and text data (1 billion text tokens, 10 billion audio tokens, and 1 trillion video tokens), much less perform these tasks on-device.
We are at the forefront of developing the model architectures that will make this a reality. Our founding team, who met as PhD candidates at the Stanford AI Lab, pioneered State Space Models (SSMs), a groundbreaking framework for training efficient, large-scale foundation models. Our talented team merges deep expertise in model innovation and systems engineering with a design-focused product engineering approach, enabling us to build and launch state-of-the-art models and user experiences.
Supported by leading investors such as Index Ventures and Lightspeed Venture Partners, along with contributions from Factory, Conviction, A Star, General Catalyst, SV Angel, Databricks, and others, we are fortunate to be guided by numerous exceptional advisors and over 90 angel investors from diverse industries, including some of the world’s foremost experts in AI.
About the Role
We are actively seeking an Inference Engineer to propel our mission of creating real-time multimodal intelligence.
Your Impact
Develop and implement a low-latency, scalable, and dependable model inference and serving stack for our innovative foundation models utilizing Transformers, SSMs, and hybrid models. Collaborate closely with our research team and product engineers to efficiently deliver our product suite in a fast, cost-effective, and reliable manner. Construct robust inference infrastructure and monitoring systems for our product offerings. Enjoy substantial autonomy in shaping our products and directly influencing how cutting-edge AI is integrated across diverse devices and applications.
What You Bring
At Cartesia, we prioritize strong engineering skills due to the complexity and scale of the challenges we tackle. Proficient engineering skills with a comfort level in navigating intricate codebases, and a commitment to producing clean, maintainable code. Experience in developing large-scale distributed systems with strict performance, reliability, and observability requirements. Proven technical leadership, capable of executing and delivering results from zero to one amidst uncertainty. A background in or experience with inference pipelines, machine learning, and generative models.

Dec 12, 2024
Crusoe
Full-time|$237.6K/yr - $288K/yr|On-site|San Francisco, CA - US

At Crusoe, our mission is to accelerate the proliferation of energy and intelligence. We're developing the technology that enables ambitious AI-driven creations without compromising on scale, speed, or sustainability.
Join us at the forefront of the AI revolution with sustainable technology. Here, you will lead innovative initiatives, make a significant impact, and work with a team that is pioneering responsible and transformative cloud infrastructure.
Role Overview:
We are on the lookout for a Senior Engineering Manager for Data Plane Systems, who will spearhead the team accountable for high-performance Software Defined Networking (SDN) data planes across hosts and Data Processing Units (DPUs). In this hands-on senior leadership capacity, you will be responsible for the architecture, implementation, and operational management of a hardware-accelerated networking stack tailored for large-scale GPU workloads, with a focus on rapid feature deployment from commit to production.
Key Responsibilities:
Technical Leadership: Define the strategic roadmap for SDN data plane systems and guide the integration of DPUs (like NVIDIA BlueField) and hardware accelerators.
Architecture & Optimization: Oversee the development of Linux kernel networking components, XDP/eBPF data paths, and DPDK-based fast paths while driving the transition of networking functions to hardware offload architectures.
Operational Excellence: Lead performance benchmarking, regression prevention, and incident response, ensuring we meet our operational goals within 3-6 month cycles.
Team Development: Mentor and cultivate a high-performing team of senior and staff-level systems engineers, setting technical standards and nurturing a culture of accountability.
Collaboration: Work closely with control-plane teams (OVN/OVS) to enhance throughput and latency for multi-tenant GPU clusters.

Feb 20, 2026
DigitalOcean, Inc.
Full-time|On-site|San Francisco

Join DigitalOcean as a Senior Data Center Engineer II, where you will play a pivotal role in optimizing our data center operations. You will collaborate with a dedicated team of engineers to ensure the efficiency and reliability of our infrastructure.

Apr 6, 2026
Perplexity
Full-time|On-site|San Francisco

Join our dynamic team at Perplexity as an AI Inference Engineer, where you will be at the forefront of deploying cutting-edge machine learning models for real-time inference. Our tech stack includes Python, Rust, C++, PyTorch, Triton, CUDA, and Kubernetes, providing you with a chance to work on large-scale applications that make a real impact.
Key Responsibilities
Design and develop APIs for AI inference that cater to both internal and external stakeholders. Conduct benchmarking and identify bottlenecks within our inference stack to enhance performance. Ensure the reliability and observability of our systems while promptly addressing any outages. Investigate innovative research and implement optimizations for LLM inference.

Jun 10, 2024
Redpanda Data
Full-time|$245K/yr - $290K/yr|On-site|San Francisco, CA

Redpanda Data is building the Agentic Data Plane (ADP), a platform that connects AI agents with enterprise data and systems. The ADP supports real-time, autonomous reasoning and action by agentic applications, powered by Redpanda's multi-modal data streaming engine. Major organizations across industries, including Activision Blizzard, Cisco, Moody's, Texas Instruments, Vodafone, and two of the top five U.S. banks, rely on Redpanda to process hundreds of terabytes of data every day. Backed by investors such as Lightspeed, GV, and Haystack VC, Redpanda operates as a globally distributed, people-first company.
Role overview
The Principal Software Engineer will architect and develop the Agentic Data Plane, which serves as the control and execution layer for AI agents interacting with enterprise data. This system enables agents to access, analyze, and act on data in real time, while providing human operators with oversight and control for secure operations. The ADP brings together Redpanda's low-latency streaming technology, a distributed query engine for real-time context, a library of over 300 data connectors, and a global policy and observability framework. This framework enforces access controls, records agent actions, and supports replayable audits.
What you will do
Design and build the core architecture of the Agentic Data Plane, focusing on secure and efficient data interaction for AI agents. Integrate streaming, query, and policy enforcement components to support real-time, autonomous agent operations. Monitor developments in the agentic AI field and translate research into engineering proposals and product strategies. Work closely with Engineering, Product, and Go-To-Market teams, as well as key customers, to shape the direction of the ADP.

Apr 30, 2026
Perplexity
Full-time|On-site|San Francisco

About the Role
We are seeking a talented Inference Engineering Manager to spearhead our AI Inference team at Perplexity. This is a remarkable opportunity to design and expand the infrastructure that drives Perplexity's innovative products and APIs, catering to millions of users with cutting-edge AI capabilities.
You will take charge of the technical direction and implementation of our inference systems while cultivating and leading a high-caliber team of inference engineers. Our technology stack encompasses Python, PyTorch, Rust, C++, and Kubernetes. You will play a crucial role in architecting and scaling the large-scale deployment of machine learning models for Perplexity's Comet, Sonar, Search, and Deep Research products.
Why Perplexity?
Develop state-of-the-art systems that are among the fastest in the industry using leading-edge technology. Engage in high-impact work within a smaller team, enjoying considerable ownership and autonomy. Seize the chance to create infrastructure from the ground up instead of maintaining outdated systems. Work across the entire spectrum: minimizing costs, scaling traffic, and advancing the capabilities of inference. Make a significant impact on the technical roadmap and team culture at a rapidly expanding company.
Responsibilities
Lead and nurture a high-performing team of AI inference engineers. Develop APIs for AI inference utilized by both internal and external clients. Design and scale our inference infrastructure for enhanced reliability and efficiency. Benchmark and resolve bottlenecks across our inference stack. Drive large sparse/MoE model inference at rack scale, including sharding strategies for extensive models. Innovate by developing inference systems that support sparse attention and disaggregated pre-fill/decoding serving. Enhance the reliability and observability of our systems and lead incident response efforts. Make technical decisions regarding batching, throughput, latency, and GPU utilization. Collaborate with ML research teams on model optimization and deployment. Recruit, mentor, and develop engineering talent. Establish team processes, engineering standards, and operational excellence.
Qualifications
5+ years of engineering experience, with at least 2 years in a technical leadership or management capacity. Proficiency in programming languages and tools such as Python, PyTorch, Rust, and C++. Experience with Kubernetes and cloud infrastructure. Strong understanding of machine learning model deployment and optimization. Exceptional problem-solving and communication skills.

Jan 18, 2026
Fluidstack
Full-time|$165K/yr - $500K/yr|On-site|San Francisco, CA

Join the Fluidstack Team
At Fluidstack, we’re pioneering the infrastructure for advanced intelligence. We collaborate with leading AI laboratories, governmental entities, and major corporations, including Mistral, Poolside, and Meta, to deliver computing solutions at unprecedented speeds.
Our mission is to transform the vision of Artificial General Intelligence (AGI) into a reality. Driven by our purpose, our dedicated team is committed to building state-of-the-art infrastructure that prioritizes our customers' success. If you share our passion for excellence and are eager to contribute to the future of intelligence, we invite you to be part of our journey.
Role Overview
The Inference Platform team at Fluidstack is at the forefront of addressing the cost and latency challenges associated with frontier AI. You will play a crucial role in managing the serving layer that connects our global accelerator supply with the production workloads of our clients, which include LLM serving frameworks, KV cache infrastructure, and Kubernetes orchestration across multiple data centers.
This hands-on individual contributor role combines elements of distributed systems, model optimization, and serving infrastructure. You will oversee the entire lifecycle of inference deployments for leading AI labs, striving for enhancements in throughput, cost-efficiency, and response times, while also influencing the architectural decisions that guide Fluidstack’s deployment strategies.

Mar 5, 2026
Baseten
Full-time|On-site|San Francisco

Baseten develops infrastructure and tools that help AI companies deploy and scale inference. Teams at organizations like Cursor, Notion, OpenEvidence, Abridge, Clay, Gamma, and Writer rely on Baseten to bring advanced machine learning models into production. The company recently secured a $300M Series E from investors including BOND, IVP, Spark Capital, Greylock, and Conviction.
Role overview
This Software Engineer - GPU Inference position joins the founding team for Baseten Voice AI in San Francisco. The team focuses on building production-ready Voice AI systems, bringing open-source voice models into real-world use for clients in productivity, customer service, healthcare conversations, and education. The work shapes how people interact with technology through voice, creating broad impact across industries. In this role, the engineer leads the internal inference stack that powers Voice AI models. Responsibilities include guiding the product roadmap and driving engineering execution. Collaboration is a key part of the job, working closely with Forward Deployed Engineers, Model Performance Engineers, and other technical groups to advance Voice AI capabilities.
Sample projects and initiatives
The world's fastest Whisper, with streaming and diarization
Canopy Labs selects Baseten for Orpheus TTS inference
Partnering with the Core Product team to build an orchestration framework for a multi-model voice agent
Working with the Training Platform team to support continuous training of voice models
Designing a developer-friendly API and SDK for self-service adoption of Baseten Voice AI products

Apr 26, 2026
Sentry
Full-time|Remote|San Francisco, California

Join Sentry as a Senior Software Engineer specializing in the Control Plane, where you will play a pivotal role in enhancing our platform's capabilities. You'll collaborate with a talented team of engineers to design, develop, and maintain high-performance systems that empower developers to monitor and fix crashes seamlessly.

Apr 9, 2026
Databricks
Full-time|$142.2K/yr - $204.6K/yr|On-site|San Francisco, California

About This Role
Join Databricks as a Software Engineer focused on GenAI inference, where you will play a pivotal role in designing, developing, and enhancing the inference engine that drives our Foundation Model API. Collaborating at the intersection of research and production, you will ensure our large language model (LLM) serving systems are optimized for speed, scalability, and efficiency. Your contributions will span the entire GenAI inference stack, from kernels and runtimes to orchestration and memory management.
What You Will Do
Participate in the design and implementation of the inference engine, collaborating on a model-serving stack tailored for large-scale LLM inference. Work closely with researchers to integrate new model architectures or features such as sparsity, activation compression, and mixture-of-experts into the engine. Optimize latency, throughput, memory efficiency, and hardware utilization across GPUs and other accelerators. Build and maintain tools for instrumentation, profiling, and tracing to identify bottlenecks and inform optimization efforts. Develop scalable routing, batching, scheduling, memory management, and dynamic loading mechanisms for inference workloads. Ensure reliability, reproducibility, and fault tolerance in inference pipelines, including A/B launches, rollback, and model versioning. Integrate with federated and distributed inference infrastructure, orchestrating across nodes, balancing load, and managing communication overhead. Engage in cross-functional collaboration with platform engineers, cloud infrastructure, and security/compliance teams. Document and share insights, contributing to internal best practices and open-source initiatives as appropriate.

Jan 30, 2026
Point
Full-time|On-site|SF

At Point, we're redefining the banking landscape by transforming the everyday debit card into a gateway for rewards and benefits. Join our innovative team as we embark on this exciting journey!
We are seeking a skilled Data Engineer to lead our data initiatives at Point. In this role, you will be responsible for designing and implementing robust data pipelines and systems that will support the development of cutting-edge algorithms and models. You will have full ownership of our data infrastructure while establishing the groundwork for data engineering practices at Point. Collaborating closely with both the engineering and data science teams, you will play a pivotal role in driving product initiatives and achieving our business objectives. This position reports directly to the CTO.
Discover more about Point's vibrant culture and dedicated team here.

Feb 13, 2020
DigitalOcean, Inc.
Full-time|Remote|San Francisco

Join DigitalOcean as a Senior Engineer focused on Inference Optimizations, where you will play a pivotal role in enhancing our AI and machine learning capabilities. Collaborate with a talented team to develop cutting-edge solutions that optimize inference processes across various applications.

Mar 17, 2026
Crusoe
Full-time|On-site|San Francisco, CA - US

About the Role
Crusoe is seeking a Software Engineer focused on the Control Plane to help build and improve control systems. This position is based in San Francisco, CA.
What You Will Do
Develop and refine software that powers Crusoe's control systems
Work closely with teammates to solve complex technical challenges
Contribute ideas and code to projects that drive the platform forward
Who We’re Looking For
Experience building or maintaining control plane software
Comfort working in a collaborative setting
Interest in tackling sophisticated software engineering problems
If you enjoy building systems and want to make an impact on Crusoe’s technology, this role offers the chance to work with a dedicated team on meaningful projects.

Apr 16, 2026
Finix
Full-time|On-site|San Francisco

Join Our Team
At Finix, we revolutionize payment processing, allowing businesses of all sizes to move and make money effortlessly. Our full-stack acquiring processor facilitates billions in transactions annually, offering modern, flexible payment solutions. Whether you’re in SaaS, e-commerce, or marketplace sectors, our no-code and low-code tools empower you to accept payments, manage payouts, and onboard merchants in just hours, not months.
Having secured over $175M in funding, including a notable $75M Series C led by Acrew Capital, we have strong backing from prominent investors like Lightspeed Venture Partners and Sequoia Capital.
Role Overview
As a Senior Data Engineer, you will be a cornerstone of Finix’s data ecosystem. You will manage our reporting cluster, ensuring data integrity and crafting tools, frameworks, and pipelines that serve as the backbone of our operations. Your expertise will enable our teams to derive actionable insights from complex datasets, facilitating critical business decisions. Collaborating closely with operations, product, and analytics teams, you will ensure that the right data is available to drive timely decisions for both Finix and our customers.

Oct 22, 2025
Unity Technologies
Full-time|$172K/yr - $215K/yr|On-site|San Francisco, CA, USA

Join Our Team
At Unity Technologies, we are on a mission to create a powerful, near real-time reporting platform that is essential for analytics and decision-making across our expansive ecosystem. We are in search of a talented Data Engineer who will play a pivotal role in architecting and implementing distributed data systems that enable this platform to operate at scale.
In this dynamic role, you will design and construct high-throughput, low-latency data processing pipelines that support reporting functionalities for both internal teams and external clients. You will work at the forefront of distributed systems, stream processing, and cloud-native infrastructure, ensuring the reliability, correctness, and scalability necessary for a high-volume production setting.
This is an impactful position where your engineering excellence, architectural vision, and ownership of production will be highly valued.

Mar 26, 2026
Zeta Global Corp.
Full-time|$165K/yr - $175K/yr|On-site|San Francisco, California, United States

WHO WE ARE
Zeta Global (NYSE: ZETA) is an innovative AI-Powered Marketing Cloud that utilizes cutting-edge artificial intelligence (AI) and vast consumer signals to streamline the process for marketers aiming to efficiently acquire, nurture, and retain customers. Our Zeta Marketing Platform (ZMP) embodies our vision of simplifying sophisticated marketing by consolidating identity, intelligence, and omnichannel activation into one cohesive platform, driven by one of the industry's largest proprietary databases and AI technology. Our enterprise clients, spanning diverse verticals, gain the ability to personalize customer interactions on an individual level across all channels, ultimately enhancing the effectiveness of their marketing initiatives. Established in 2007 by David A. Steinberg and John Sculley, Zeta Global is headquartered in New York City and has a global presence. To discover more, visit www.zetaglobal.com.
The Role
We are in search of a Senior Data Engineer to architect, develop, and manage the data pipelines and aggregates that fuel Zeta’s AdTech platform. This is a hands-on individual contributor position focused on high-scale batch and streaming data processing, delivering reliable data products and analytics-ready datasets that facilitate predictive analytics, agentic workflows, business intelligence reporting, and measurement. You will work in close collaboration with backend, machine learning, and product teams to provide trustworthy, well-structured data that exhibits exceptional performance, quality, and observability.

Mar 19, 2026
Kiddom
Full-time|On-site|San Francisco

Join Kiddom as a Senior Data Engineer and play a pivotal role in transforming the educational landscape through innovative technology. Our mission is to enhance learning experiences with an advanced curriculum, AI, and a robust SaaS infrastructure that empowers schools to provide personalized education at scale. Our platform delivers real-time insights and versatile tools that allow educators to prioritize student growth and equity effectively.
At Kiddom, we don’t just create technology; we spearhead innovation in an industry ripe for change. Our dedicated team collaborates across engineering, design, research, and education to craft experiences that redefine possibilities for both learners and educators.
If you excel in dynamic environments, cherish high-ownership cultures, and are passionate about merging human impact with cutting-edge technology, this is your opportunity to make a meaningful difference.
Role Overview:
As a Senior Data Engineer, you will be an integral part of our Data Architecture team, responsible for building and enhancing Kiddom’s data systems. You will focus on improving our core architecture for curriculum graphs and analytics while supporting various teams across the organization in utilizing the Kiddom data platform, ensuring student data privacy and security are paramount.
Your role will involve close collaboration with departments such as Product, Engineering, and Analytics to understand their data requirements. You will define and document data workflows, pipelines, and transformation processes to ensure clarity and facilitate knowledge sharing.
We seek a candidate with exceptional communication skills who can convey complex technical ideas to non-technical stakeholders. A solid understanding of PII compliance and best practices in data handling and storage is essential. If you possess strong problem-solving abilities and a talent for optimizing performance while ensuring data integrity and accuracy, we would love to hear from you!

Aug 6, 2024
Pinterest, Inc.
Full-time|Remote|San Francisco, CA, US; Remote, US

Join our dynamic team at tvScientific as a Senior Data Engineer. In this role, you will play a pivotal part in shaping our data infrastructure, enhancing our analytics capabilities, and driving data-driven decision-making across the organization. You will collaborate closely with cross-functional teams to implement innovative data solutions that support our strategic objectives.

Mar 4, 2026
