Inference Software Engineer

Etched, San Jose
On-site, Full-time

About the job

About Etched

Etched is building the world's first AI inference system designed specifically for transformers, delivering over 10x higher performance at significantly lower cost and latency than B200 systems. Our custom ASICs enable products that were previously out of reach on GPUs, such as real-time video generation models and advanced chain-of-thought reasoning agents. With substantial backing from leading investors and a team of top engineers, Etched is reshaping the infrastructure of one of the fastest-growing industries in history.

Key Responsibilities

  • Assist in porting cutting-edge models to our architecture and contribute to the development of programming abstractions and testing capabilities to streamline the model porting process.

  • Develop, enhance, and scale Sohu's runtime, focusing on multi-node inference, intra-node execution, state management, and effective error handling.

  • Optimize routing and communication layers using Sohu's collectives.

  • Use performance profiling and debugging tools to pinpoint bottlenecks and correctness issues.

Ideal Candidate Profile

  • Strong proficiency in C++ or Rust programming languages.

  • Solid understanding of performance-critical and complex distributed software systems, including Linux internals, accelerator architectures (e.g., GPUs, TPUs), compilers, and high-speed interconnects (e.g., NVLink, InfiniBand).

  • Familiarity with machine learning frameworks such as PyTorch or JAX.

  • Experience in porting applications to non-standard accelerator hardware or platforms.

Preferred Qualifications

  • Experience in developing low-latency, high-performance applications using both kernel-level and user-space networking stacks.

  • In-depth understanding of distributed systems concepts, algorithms, and challenges, including consensus protocols and communication patterns.

  • Thorough knowledge of Transformer architectures, particularly Mixture-of-Experts (MoE).

  • Experience building applications with substantial SIMD (Single Instruction, Multiple Data) optimizations for performance-critical paths.

Benefits

  • Health Insurance
  • 401(k)

About Etched

Etched is at the forefront of AI technology, creating specialized systems that outperform traditional hardware. Our innovations empower developers to craft advanced AI applications while ensuring cost efficiency and performance excellence. Join us in shaping the future of AI infrastructure.
