Senior AI Infrastructure Engineer - Model Serving Platform

Scale AI | San Francisco, CA; New York, NY
On-site Full-time $216.2K/yr - $270.3K/yr

Experience Level

Senior

Qualifications

Key Responsibilities:
- Develop and sustain fault-tolerant, high-performance systems capable of serving LLM workloads at scale.
- Create an internal platform to facilitate LLM capability discovery.
- Collaborate with researchers and engineers to integrate and enhance models for both production and research purposes.
- Conduct architecture and design reviews to ensure adherence to best practices in system design and scalability.
- Implement monitoring and observability solutions to guarantee system health and performance.
- Lead projects from inception to completion in a cross-functional team setting.

Preferred Qualifications:
- 5+ years of experience in developing large-scale, high-performance backend systems.
- Proficient programming skills in languages such as Python, Go, Rust, or C++.
- Familiarity with LLM serving and routing principles including rate limiting, token streaming, load balancing, and budgeting.
- Understanding of LLM capabilities and concepts such as reasoning, tool calling, and prompt templates.
- Experience with containerization and orchestration tools like Docker and Kubernetes.
- Knowledge of cloud infrastructures such as AWS and GCP, along with infrastructure as code practices (e.g., Terraform).
- Demonstrated ability to independently tackle complex challenges in a fast-paced environment.

Desirable Skills:
- Experience with contemporary LLM serving frameworks like vLLM, SGLang, TensorRT-LLM, or text-generation-inference.

About the job

Join our dynamic Machine Learning Infrastructure team as a Senior AI Infrastructure Engineer, where you will play a pivotal role in designing and constructing platforms that ensure the scalable, reliable, and efficient serving of Large Language Models (LLMs). Our innovative platform supports a range of cutting-edge research and production systems, catering to both internal and external applications across diverse environments.

The ideal candidate will possess a solid foundation in machine learning principles coupled with extensive experience in backend system architecture. You will thrive in a collaborative environment that bridges research and engineering, working to provide seamless experiences for our customers and to accelerate innovation across the organization.

About Scale AI

Scale AI is at the forefront of advancing artificial intelligence technology, providing a robust platform that empowers organizations to harness the power of machine learning. Our commitment to innovation and excellence drives us to create systems that enhance productivity and foster groundbreaking research.
