AI Architecture Intern - Inference

Etched, San Jose
On-site Internship

Experience Level

Entry Level

About the job

AI Architecture Intern - Inference
Location: San Jose, CA
Team: Architecture

About Etched

At Etched, we are pioneering the development of the world’s first AI inference system specifically designed for transformers, achieving over 10 times the performance and significantly reduced cost and latency compared to traditional systems. Our innovative ASICs empower the creation of groundbreaking products, enabling real-time video generation and advanced reasoning agents that are unattainable with conventional GPUs. Supported by substantial investments from leading venture capital firms and staffed by top-tier engineering talent, Etched is at the forefront of transforming the infrastructure of the fastest-growing industry.

The Role

We are in search of a motivated Architecture Intern to join our dynamic team, contributing to the design and optimization of next-generation AI accelerators. This role will involve developing and fine-tuning compute architectures that deliver outstanding performance and efficiency for transformer workloads. Throughout your internship, you will tackle cutting-edge architectural challenges and engage in performance modeling.

Key Responsibilities

  • Assist in adapting state-of-the-art models to our architecture and develop programming abstractions and testing capabilities for rapid model iteration.

  • Help enhance and scale Sohu’s runtime, including multi-node inference, intra-node execution, state management, and robust error handling.

  • Contribute to the optimization of routing and communication layers utilizing Sohu’s collectives.

  • Employ performance profiling and debugging tools to pinpoint bottlenecks and correctness issues.

  • Gain a deep understanding of Sohu to collaboratively design hardware instructions and model architecture operations to maximize performance.

  • Implement high-performance software components for the Model Toolkit.

Qualifications

  • Currently pursuing a Bachelor’s, Master’s, or PhD in Computer Science, Computer Engineering, Applied Mathematics, or a related discipline.

  • Strong programming skills in Python and C++.

  • Familiarity with performance-sensitive or complex distributed software systems, such as Linux internals, accelerator architectures (e.g., GPUs, TPUs), and compilers.

About Etched

Etched is at the cutting edge of AI technology, creating a uniquely capable inference system designed for transformers that significantly outperforms traditional solutions. Our technology not only enhances performance but also reduces operational costs and latencies, setting new standards in the industry. With a strong backing from elite investors and a team of leading engineers, we are revolutionizing the infrastructure for the AI sector.
