About the job
About Etched
Etched is building the world's first AI inference system designed specifically for transformers, delivering over 10x the performance of conventional systems like the B200 at significantly lower cost and latency. Etched's custom ASICs enable products, such as real-time video generation models and advanced deep reasoning agents, that are unattainable on traditional GPUs. With substantial backing from prestigious investors and a team of elite engineers, Etched is transforming the infrastructure of the industry's fastest-moving field.
Job Overview
As a key leader in our organization, you will head a team that builds a comprehensive suite of optimized kernels and deploys high-performance inference stacks for cutting-edge transformer models (e.g., Llama-3, Llama-4, DeepSeek-R1, Qwen-3, Stable Diffusion 3). You will manage and grow a high-caliber team focused on novel model mapping techniques while co-designing inference-time algorithms (e.g., speculative and parallel decoding, prefill-decode disaggregation).
Key Responsibilities
Architect Superior Inference Performance: Achieve continuous batching throughput that exceeds the B200's by at least 10x on high-priority workloads.
Create High-Performance Inference Mega Kernels: Design complex, fused kernels that maximize chip utilization and minimize inference latency, validated through benchmarking and regression testing in live production environments.
Develop Model Mapping Strategies: Implement system-level enhancements that combine tensor parallelism and expert parallelism to maximize performance.
Innovate Hardware-Software Co-design: Create and ship production-ready inference-time algorithmic improvements (e.g., speculative decoding, prefill-decode disaggregation, KV cache offloading).
Build a Scalable Team: Recruit and retain a team of exceptional inference optimization engineers.
Align Cross-Functional Performance Goals: Ensure that the inference stack and its performance targets stay aligned with the software infrastructure teams' work (e.g., runtime support, scheduling).