
Software Engineer - Model APIs

Baseten | San Francisco
On-site | Full-time





Qualifications

  • Proven experience in software engineering, particularly in API development and optimization.

  • Strong understanding of distributed systems, model serving, and performance optimization techniques.

  • Proficiency in CUDA programming and experience with TensorRT or similar technologies.

  • Familiarity with benchmarking methodologies and performance analysis tools.

  • Excellent problem-solving skills and the ability to work effectively in a collaborative team environment.

  • Strong communication skills to articulate complex concepts to technical and non-technical stakeholders.

About the job

ABOUT BASETEN

At Baseten, we are at the forefront of AI innovation, providing critical inference solutions for leading AI companies like Cursor, Notion, OpenEvidence, Abridge, Clay, Gamma, and Writer. Our platform combines advanced AI research, adaptable infrastructure, and intuitive developer tools, empowering organizations to deploy state-of-the-art models effectively. With rapid growth and a recent $300M Series E funding round backed by top-tier investors including BOND, IVP, Spark Capital, Greylock, and Conviction, we invite you to join our mission in building the platform of choice for engineers delivering AI products.

THE ROLE:

As a member of Baseten’s Model Performance (MP) team, you will play a pivotal role in ensuring our platform’s model APIs are not only fast and reliable but also cost-effective. Your primary focus will be on developing and optimizing the infrastructure behind our hosted API endpoints for cutting-edge open-source models. The role spans distributed systems, model serving, and developer experience. You will collaborate with a small, dynamic team at the intersection of product development, model performance, and infrastructure, defining how developers interact with AI models at scale.

RESPONSIBILITIES:

  • Design, develop, and maintain the Model APIs surface, focusing on advanced inference features such as structured outputs (JSON mode, grammar-constrained generation), tool/function calling, and multi-modal serving.

  • Profile and optimize TensorRT-LLM kernels, analyze CUDA kernel performance, create custom CUDA operators, and enhance memory allocation patterns for maximum efficiency across multi-GPU setups.

  • Implement performance improvements across various runtimes based on a deep understanding of their internals, including speculative decoding, guided generation for structured outputs, and custom scheduling algorithms for high-performance serving.

  • Develop robust benchmarking frameworks to evaluate real-world performance across diverse model architectures, batch sizes, sequence lengths, and hardware configurations.

  • Enhance performance across runtimes (e.g., TensorRT, TensorRT-LLM) through techniques such as speculative decoding, quantization, batching, and KV-cache reuse.

  • Integrate deep observability mechanisms (metrics, traces, logs) and establish repeatable benchmarks to assess speed, reliability, and quality.

