ABOUT BASETEN
Baseten is at the forefront of AI technology, empowering leading-edge companies like Cursor, Notion, OpenEvidence, Abridge, Clay, Gamma, and Writer to seamlessly integrate advanced AI models into their operations. Our unique blend of applied AI research, adaptable infrastructure, and intuitive developer tools enables innovators to bring their most ambitious AI products to life. With our recent $300M Series E funding from top-tier investors such as BOND, IVP, Spark Capital, Greylock, and Conviction, we are poised for rapid growth. Join us in shaping the platform that engineers rely on to deploy transformative AI solutions.
THE ROLE
Are you driven by a passion for enhancing artificial intelligence applications? We are seeking a proactive Software Engineer specializing in ML performance to join our energetic team. This position is perfect for backend engineers who thrive in a fast-paced startup environment and are eager to make substantial contributions to Large Language Model (LLM) inference. If you're enthusiastic about optimizing open-source ML models, we can't wait to hear from you!
EXAMPLE INITIATIVES
As a member of our Model Performance team, you will have the opportunity to work on exciting, high-impact projects such as those outlined in the responsibilities below.
RESPONSIBILITIES
Develop, refine, and implement advanced techniques (quantization, speculative decoding, KV cache reuse, chunked prefill, and LoRA) for ML model inference and infrastructure.
Conduct thorough investigations into the codebases of TensorRT, PyTorch, TensorRT-LLM, vLLM, SGLang, CUDA, and other libraries to troubleshoot and resolve ML performance issues.
Scale and apply optimization techniques across a diverse array of ML models, with a focus on large language models.