Software Engineer - GPU Inference at Baseten | San Francisco

Baseten | San Francisco

On-site Full-time

Experience Level

Entry Level

Qualifications

We are looking for individuals who possess a strong background in software engineering, particularly with experience in GPU inference. Candidates should demonstrate a passion for AI technologies, as well as a desire to innovate and contribute to a collaborative team environment. Familiarity with voice recognition systems and open-source projects will be a significant advantage. Strong problem-solving skills and the ability to work effectively across teams are essential.

About the job

Role overview

This Software Engineer - GPU Inference position joins the founding team for Baseten Voice AI in San Francisco. The team focuses on building production-ready Voice AI systems, bringing open-source voice models into real-world use for clients in productivity, customer service, healthcare conversations, and education. The work shapes how people interact with technology through voice, creating broad impact across industries.

In this role, the engineer leads the internal inference stack that powers Voice AI models. Responsibilities include guiding the product roadmap and driving engineering execution. Collaboration is a key part of the job, working closely with Forward Deployed Engineers, Model Performance Engineers, and other technical groups to advance Voice AI capabilities.

Sample projects and initiatives

The world's fastest Whisper, with streaming and diarization
Canopy Labs selects Baseten for Orpheus TTS inference
Partnering with the Core Product team to build an orchestration framework for a multi-model voice agent
Working with the Training Platform team to support continuous training of voice models
Designing a developer-friendly API and SDK for self-service adoption of Baseten Voice AI products

About Baseten

Baseten is a dynamic and rapidly growing company dedicated to advancing AI technology. With a focus on providing mission-critical inference capabilities, we support leading AI companies in deploying their models effectively. Our innovative approach combines cutting-edge research with developer-friendly tools, enabling a transformative impact across various industries.

Similar jobs

Data Engineer at Baseten | San Francisco

Baseten

Full-time|Remote|San Francisco

Join Baseten as a Data Engineer and be at the forefront of data-driven innovation. In this role, you will design and implement robust data pipelines, ensuring the efficient processing and analysis of data to empower our products and decision-making processes. Collaborate with cross-functional teams to understand their data needs, while striving for optimization and scalability in data architectures.

Mar 18, 2026

Software Engineer - GPU Inference at Baseten | San Francisco

Baseten

Full-time|On-site|San Francisco

Baseten develops infrastructure and tools that help AI companies deploy and scale inference. Teams at organizations like Cursor, Notion, OpenEvidence, Abridge, Clay, Gamma, and Writer rely on Baseten to bring advanced machine learning models into production. The company recently secured a $300M Series E from investors including BOND, IVP, Spark Capital, Greylock, and Conviction.

Role overview

This Software Engineer - GPU Inference position joins the founding team for Baseten Voice AI in San Francisco. The team focuses on building production-ready Voice AI systems, bringing open-source voice models into real-world use for clients in productivity, customer service, healthcare conversations, and education. The work shapes how people interact with technology through voice, creating broad impact across industries.

In this role, the engineer leads the internal inference stack that powers Voice AI models. Responsibilities include guiding the product roadmap and driving engineering execution. Collaboration is a key part of the job, working closely with Forward Deployed Engineers, Model Performance Engineers, and other technical groups to advance Voice AI capabilities.

Sample projects and initiatives

The world's fastest Whisper, with streaming and diarization
Canopy Labs selects Baseten for Orpheus TTS inference
Partnering with the Core Product team to build an orchestration framework for a multi-model voice agent
Working with the Training Platform team to support continuous training of voice models
Designing a developer-friendly API and SDK for self-service adoption of Baseten Voice AI products

Apr 26, 2026

Software Engineer - Realtime Systems at Baseten | San Francisco

Baseten

Full-time|On-site|San Francisco

Baseten supports companies like Cursor, Notion, and Writer in running AI inference at scale. The team blends AI research, adaptive infrastructure, and developer tools to help organizations deploy advanced AI models efficiently. Backed by investors such as BOND, IVP, and Greylock, Baseten recently raised a $300M Series E. The company aims to be the trusted platform for engineers launching AI products.

Role overview

The Software Engineer - Realtime Systems (Voice AI) role focuses on building and deploying production-ready Voice AI systems. Baseten’s Voice AI team works with open-source models to power applications in productivity, customer support, clinical conversations, creative tools, and education. Engineers in this group influence how people use voice to interact with technology, shaping products that impact multiple industries.

This position involves leading Voice AI projects, setting both product direction and technical strategy. Collaboration is a key part of the work: expect to partner with Forward Deployed Engineers, Model Performance Engineers, and other teams to advance Baseten’s Voice AI capabilities.

Sample projects

The world's fastest Whisper, with streaming and diarization
Orpheus TTS inference partnership with Canopy Labs
Collaborate with the Core Product team to build a multi-model voice agent using Baseten’s orchestration framework
Work alongside the Training Platform team to support ongoing training of voice models
Design APIs and SDKs that make Baseten Voice AI products accessible for developers

Location

This role is based in San Francisco.

Apr 26, 2026

Onboarding Program Manager at Baseten | San Francisco

Baseten

Full-time|On-site|San Francisco

Join Baseten as an Onboarding Program Manager where you will play a vital role in shaping the onboarding experience for our new team members. You will be responsible for developing and implementing effective onboarding programs that enhance employee engagement and retention.

Mar 4, 2026

Integrated Marketing Manager at Baseten | San Francisco

Baseten

Full-time|On-site|San Francisco

About Baseten

Baseten supports leading AI companies, including Cursor, Notion, OpenEvidence, Abridge, Clay, Gamma, and Writer, by delivering essential inference capabilities. The platform brings together advanced AI research, flexible infrastructure, and developer-friendly tools, helping teams move models from the lab into production. Backed by a recent $300M Series E funding round and investors such as BOND, IVP, Spark Capital, Greylock, and Conviction, Baseten is growing quickly in its mission to become the platform engineers trust for building and shipping AI products.

Role Overview

The Integrated Marketing Manager will shape and run multi-channel marketing campaigns to drive a qualified pipeline and strengthen Baseten’s go-to-market approach. This role calls for a strategic marketer with hands-on experience in AI, comfortable guiding campaigns from initial idea through launch and measurement, and collaborating across teams and channels.

What You Will Do

Develop and execute full-funnel campaign programs that include content, paid media, email outreach, events, and web initiatives
Increase awareness, engagement, and pipeline growth as Baseten scales through FY’27
Work closely with cross-functional teams to ensure campaigns align with business goals and market needs
Analyze campaign performance and apply insights to improve future efforts

Location

This position is based in San Francisco.

Apr 17, 2026

Account Executive, Industries - Baseten

Baseten

Full-time|On-site|San Francisco

Join Baseten as an Account Executive in the Industries division, where you'll play a pivotal role in driving growth and building strong client relationships. In this position, you will leverage your expertise to engage with prospective customers, understand their needs, and offer tailored solutions that align with their objectives. Ideal candidates will possess exceptional communication skills, a strong sales acumen, and a passion for technology.

Mar 20, 2026

Dynamic Office Manager Opportunity in San Francisco

Baseten

Full-time|On-site|San Francisco

Join Baseten as an Office Manager in the vibrant city of San Francisco! In this pivotal role, you will oversee daily office operations, ensuring a smooth and productive work environment. You will be the point of contact for employees, handling administrative tasks, coordinating office events, and providing support to management. Your organizational skills and proactive approach will help us create a welcoming and efficient workspace.

Mar 2, 2026

Strategic Finance Specialist - GTM at Baseten | San Francisco

Baseten

Full-time|Remote|San Francisco

Baseten is looking for a Strategic Finance Specialist to focus on Go-To-Market (GTM) strategies in San Francisco. This position centers on analyzing financial data and supporting key decisions that shape GTM initiatives.

Role overview

This role works closely with teams across the company to improve growth and efficiency. The Strategic Finance Specialist uses financial insights to guide GTM projects and influence business outcomes.

What you will do

Analyze financial data related to GTM strategies
Support strategic decision-making with clear financial insights
Collaborate with cross-functional teams to enhance GTM initiatives
Help drive growth and operational efficiency through financial expertise

Location

This position is based in San Francisco.

Apr 28, 2026

Security Engineer

Baseten

Full-time|On-site|San Francisco

Join Baseten as a Security Engineer, where you will play a vital role in safeguarding our systems and data. You will be responsible for implementing security measures, conducting audits, and ensuring compliance with industry standards. This position offers the opportunity to work with cutting-edge technology in a collaborative environment.

Apr 1, 2026

IT Support and Operations Engineer

Baseten

Full-time|Remote|San Francisco

Join Baseten as an IT Support and Operations Engineer where you will play a pivotal role in ensuring the seamless operation of our IT infrastructure. You will be responsible for providing technical support, managing system operations, and collaborating with cross-functional teams to maintain high service standards.

Apr 3, 2026

Software Engineer - Internal Platform

Baseten

Full-time|On-site|San Francisco

ABOUT BASETEN

At Baseten, we empower cutting-edge AI companies, including Cursor, Notion, OpenEvidence, Abridge, Clay, Gamma, and Writer, to achieve mission-critical inference. By merging advanced AI research with flexible infrastructure and intuitive developer tools, we facilitate the deployment of innovative AI models into production. Having recently secured a $300M Series E funding round from esteemed investors like BOND, IVP, Spark Capital, Greylock, and Conviction, we are on a rapid growth trajectory. Join our team and contribute to building a platform that engineers rely on to launch AI products successfully.

THE ROLE

As a key member of Baseten's Platform Team, you will play a crucial role in developing internal infrastructure to support our engineering division. While our product offers infrastructure for AI advancements, your primary focus will be on crafting robust internal systems that enhance productivity, collaboration, and work quality across engineering teams, leveraging exceptional tools, efficient workflows, and resilient development settings.

If you have a passion for elegant solutions, such as streamlined monorepos, rapid CI pipelines, and well-designed shared libraries, you will excel at Baseten.

RESPONSIBILITIES

Develop a range of tools customized to meet the diverse needs of engineering teams.
Enhance monorepo functionality and create project templates to ensure consistency and efficiency.
Design and implement shared libraries focused on system observability.
Optimize the speed, reliability, and thoroughness of our CI pipelines.
Assist in designing and maintaining Terraform modules for effective infrastructure management.
Provide innovative solutions to improve visibility within continuous delivery (CD) processes.
Proactively support engineering teams, ensuring they have the necessary resources and tools for maximum productivity.

REQUIREMENTS

Proficiency in Go and/or Python programming languages.
Experience with Kubernetes and Docker tools (e.g., Helm, Docker, Kubernetes).
Demonstrated experience managing and working with large monorepos.
Strong problem-solving skills with an emphasis on efficient software delivery.
Familiarity with CI/CD methodologies and tools.
Excellent communication and collaboration skills.

Mar 26, 2025

Software Engineer - Voice AI

Baseten

Full-time|On-site|San Francisco

Baseten creates AI inference solutions for clients such as Cursor, Notion, OpenEvidence, Abridge, Clay, Gamma, and Writer. The team blends AI research, infrastructure, and developer tools to help organizations deploy advanced models. Backed by $300M in Series E funding from BOND, IVP, Spark Capital, Greylock, and Conviction, Baseten is expanding quickly and shaping the landscape for engineers building AI products.

Role overview

The Software Engineer - Voice AI role centers on building and deploying open-source voice models for real-world use. Voice is becoming a key interface across the web, and this position addresses the technical challenges of bringing production-ready Voice AI to market. The work supports applications in productivity, customer service, clinical dialogue, creator tools, education, and more, helping to change how people interact with technology across sectors.

This engineer leads Baseten’s Voice AI efforts, guiding the proprietary inference stack that powers Voice AI models. The role balances shaping the product roadmap with hands-on engineering. Collaboration is a core part of the job, working closely with Forward Deployed Engineers, Model Performance Engineers, and other technical teams to advance Voice AI capabilities.

Sample projects and initiatives

The world's fastest Whisper, with streaming and diarization
Canopy Labs selects Baseten for Orpheus TTS inference
Partnering with the Core Product team to build an orchestration framework for a multi-model voice agent
Working with the Training Platform team to support ongoing training of voice models
Designing a developer-friendly API and SDK to encourage self-service adoption of Baseten Voice AI products

Location

San Francisco

Apr 26, 2026

Software Engineer for Innovative Product Development

Baseten

Full-time|On-site|San Francisco

Join Baseten as a Software Engineer focused on developing cutting-edge products that push the boundaries of technology. In this role, you will collaborate with a dynamic team to design, implement, and maintain innovative software solutions that meet the needs of our users. You will have the opportunity to work on exciting projects that utilize the latest technologies and methodologies.

Feb 24, 2026

Software Engineer - Billing and Internal Tooling

Baseten

Full-time|On-site|San Francisco

Join Baseten as a Software Engineer focused on Billing and Internal Tooling, where you will play a crucial role in developing and enhancing our internal systems. You will collaborate with cross-functional teams to create efficient billing solutions and streamline internal processes. Your contributions will directly impact our operational efficiency and customer satisfaction.

Feb 27, 2026

Forward Deployed Engineer

Baseten

Full-time|$300K/yr|On-site|San Francisco

ABOUT BASETEN

At Baseten, we empower revolutionary AI companies, including Cursor, Notion, OpenEvidence, Abridge, Clay, Gamma, and Writer, to achieve mission-critical inference. By integrating cutting-edge AI research, adaptable infrastructure, and user-friendly developer tools, we enable organizations at the forefront of AI to seamlessly deploy advanced models into production. Our rapid growth is exemplified by our recent $300M Series E funding, supported by esteemed investors such as BOND, IVP, Spark Capital, Greylock, and Conviction. Join us in creating the essential platform that engineers rely on to launch AI products successfully.

THE ROLE

As a Forward Deployed Engineer at Baseten, you will collaborate closely with clients to design, develop, and implement high-scale production AI applications utilizing our platform. You will guide customers through their journey from initial concept to full production deployment, expertly translating vague business objectives into dependable, observable services that meet clear standards for quality, latency, and cost.

This position is ideal for innovative engineers eager to gain insights into how modern enterprises adopt AI at scale, and who thrive in environments that blend product development, software engineering, performance optimization, and direct customer engagement.

Please note that this is a hands-on engineering role involving coding and software development, complemented by responsibilities in product management, technical customer success, and pre-sales solution engineering.

EXAMPLE INITIATIVES

Explore these insightful blog posts authored by our Forward Deployed Engineering team:

Forward Deployed Engineering on the Frontier of AI
The Fastest, Most Accurate Whisper Transcription
Deploy Production-Ready Model Servers from Docker Images
Deploy Custom ComfyUI Workflows as APIs

Mar 28, 2024

AI Solutions Engineer

Baseten

Full-time|$300K/yr|On-site|San Francisco

ABOUT BASETEN

Baseten is at the forefront of AI innovation, empowering top-tier companies like Cursor, Notion, OpenEvidence, Abridge, Clay, Gamma, and Writer to achieve mission-critical AI implementations. By integrating cutting-edge AI research, adaptable infrastructure, and streamlined developer tools, we help organizations harness the power of advanced models and bring them into production. As we continue to expand rapidly, having recently secured our $300M Series E funding round from esteemed investors such as BOND, IVP, Spark Capital, Greylock, and Conviction, we invite you to join our team and contribute to the platform that engineers rely on to deliver innovative AI products.

THE ROLE

As an AI Solutions Engineer at Baseten, you will collaborate closely with clients to design, develop, and implement high-performance production AI applications utilizing Baseten's platform. You will guide customers through the entire process, from initial exploration to successful deployment, effectively translating complex business objectives into robust, observable services that deliver clear metrics on quality, latency, and cost.

This position is ideal for proactive engineers eager to gain insights into how modern enterprises scale their AI adoption. You will thrive in a multidisciplinary environment, working across product development, software engineering, performance optimization, and direct customer engagement.

It’s important to note that this is a hands-on engineering position that involves coding and software development, while also encompassing elements of product management, technical customer success, and pre-sales solution engineering.

EXAMPLE INITIATIVES

Explore the innovative projects undertaken by our Forward Deployed Engineering team:

Forward Deployed Engineering on the Frontier of AI
The Fastest, Most Accurate Whisper Transcription
Deploy Production-Ready Model Servers from Docker Images
Deploy Custom ComfyUI Workflows as APIs

Nov 4, 2025

Software Engineer - Core Product

Baseten

Full-time|$300K/yr|On-site|San Francisco

ABOUT BASETEN

At Baseten, we empower innovative AI companies like Cursor, Notion, OpenEvidence, Abridge, Clay, Gamma, and Writer to execute mission-critical inference with ease. By merging advanced AI research with flexible infrastructure and robust developer tools, we enable organizations at the forefront of AI to seamlessly deploy cutting-edge models into production. Fueled by rapid growth and a recent $300M Series E investment from industry leaders such as BOND, IVP, Spark Capital, Greylock, and Conviction, we're building the essential platform that engineers trust to launch AI products.

THE ROLE

As a Software Engineer on our Core Product team, you will play a pivotal role in developing and enhancing the core Baseten platform, empowering users to effortlessly deploy and derive value from machine learning models. Given our developer-centric approach, you will engage with a vast array of components, including CLI tools, REST APIs, and the web application. The Core Product team leads all new product innovations within Baseten.

EXAMPLE INITIATIVES

As part of our Core Product team, you will tackle exciting projects such as:

Chains for multi-component workflows
Asynchronous inference
Model APIs for cutting-edge models
Model training optimized for production inference

RESPONSIBILITIES

Develop and implement new features and products for the team
Design intuitive APIs and abstractions to effectively address customer needs
Quickly resolve bugs and customer issues with a proactive approach
Work across the technology stack; you'll engage with both React Components and Kubernetes Pods
Collaborate closely with product managers and cross-functional teams to drive product success

Jul 9, 2024

Software Engineer - Model APIs

Baseten

Full-time|On-site|San Francisco

ABOUT BASETEN

At Baseten, we are at the forefront of AI innovation, providing critical inference solutions for leading AI companies like Cursor, Notion, OpenEvidence, Abridge, Clay, Gamma, and Writer. Our platform combines advanced AI research, adaptable infrastructure, and intuitive developer tools, empowering organizations to deploy state-of-the-art models effectively. With rapid growth and a recent $300M Series E funding round backed by top-tier investors including BOND, IVP, Spark Capital, Greylock, and Conviction, we invite you to join our mission in building the platform of choice for engineers delivering AI products.

THE ROLE:

As a member of Baseten’s Model Performance (MP) team, you will play a pivotal role in ensuring our platform’s model APIs are not only fast and reliable but also cost-effective. Your primary focus will be on developing and optimizing the infrastructure that supports our hosted API endpoints for cutting-edge open-source models. This role involves working with distributed systems, model serving, and enhancing the developer experience.

You will collaborate with a small, dynamic team at the intersection of product development, model performance, and infrastructure, defining how developers interact with AI models on a large scale.

RESPONSIBILITIES:

Design, develop, and maintain the Model APIs surface, focusing on advanced inference features such as structured outputs (JSON mode, grammar-constrained generation), tool/function calling, and multi-modal serving.
Profile and optimize TensorRT-LLM kernels, analyze CUDA kernel performance, create custom CUDA operators, and enhance memory allocation patterns for maximum efficiency across multi-GPU setups.
Implement performance improvements across various runtimes based on a deep understanding of their internals, including speculative decoding, guided generation for structured outputs, and custom scheduling algorithms for high-performance serving.
Develop robust benchmarking frameworks to evaluate real-world performance across diverse model architectures, batch sizes, sequence lengths, and hardware configurations.
Enhance performance across runtimes (e.g., TensorRT, TensorRT-LLM) through techniques such as speculative decoding, quantization, batching, and KV-cache reuse.
Integrate deep observability mechanisms (metrics, traces, logs) and establish repeatable benchmarks to assess speed, reliability, and quality.

Oct 11, 2025

Software Engineer - AI Enablement

Baseten

Full-time|On-site|San Francisco

Join our innovative team at Baseten as a Software Engineer - AI Enablement. In this role, you will work on cutting-edge AI technologies and help build tools that empower developers to harness the full potential of artificial intelligence.

We are looking for passionate engineers who thrive in a collaborative environment and are eager to tackle challenging problems. You will be responsible for designing and implementing scalable AI solutions, working closely with cross-functional teams to deliver impactful results.

Feb 24, 2026

Software Engineer - Developer Ecosystem

Baseten

Full-time|On-site|San Francisco

Join Baseten as a Software Engineer focused on enhancing our Developer Ecosystem. You will be instrumental in crafting solutions that enable developers to build, train, and deploy machine learning models seamlessly. Your role will involve collaborating with cross-functional teams to innovate and optimize our platform, ensuring it meets the needs of our users.

Mar 20, 2026
