Staff Software Engineer, Model Serving

Databricks, San Francisco, California
On-site, Full-time, $192K/yr - $260K/yr

Experience Level

Senior

Qualifications

The impact you will have:
Design and implement essential systems and APIs that drive Databricks Model Serving, ensuring scalability, reliability, and operational excellence.
Collaborate with product and engineering leadership to outline the technical roadmap and long-term architecture for serving workloads.
Guide architectural decisions to enhance performance, throughput, autoscaling, and operational efficiency for CPU and GPU serving workloads.
Contribute directly to critical components across the serving infrastructure, from model container builds and deployment workflows to runtime systems such as routing, caching, observability, and intelligent autoscaling, ensuring seamless operations at scale.
Work cross-functionally with product, platform, and research teams to convert customer needs into dependable and high-performing systems.
Lead initiatives that enhance latency, availability, and cost-effectiveness across both customer-facing and foundational serving layers.
Establish best practices for code quality, testing, and operational readiness, while mentoring fellow engineers through design reviews and technical guidance.
Represent the team in cross-organizational technical discussions, influencing the broader AI platform strategy at Databricks.

About the job

At Databricks, we are dedicated to empowering data teams to tackle the most challenging problems in the world, from realizing the future of transportation to fast-tracking medical innovations. We accomplish this by developing and operating the premier data and AI infrastructure platform, so that our customers can use deep data insights to improve their business.

Our Model Serving product equips organizations with a cohesive, scalable, and governed solution for deploying and managing AI/ML models — ranging from traditional machine learning to intricate proprietary large language models. It ensures real-time, low-latency inference, governance, monitoring, and lineage. As the adoption of AI surges, Model Serving stands as a fundamental component of the Databricks platform, allowing customers to operationalize models at scale with robust SLAs and cost efficiency.

In the role of Staff Engineer, you will significantly influence both the product experience and the core infrastructure of Model Serving. Your responsibilities will include designing and constructing systems that facilitate high-throughput, low-latency inference across CPU and GPU workloads, steering architectural strategies, and collaborating extensively with platform, product, infrastructure, and research teams to create an exceptional serving platform.

About Databricks

Databricks is at the forefront of data and AI innovation, committed to equipping data teams with the tools needed to solve pressing global challenges. Our platform allows organizations to leverage comprehensive data insights to drive significant business improvements.
