About the job
About the Role
We're excited to invite you to join Wafer as a Spring Intern, where you will play a crucial role in shaping the future of AI infrastructure and GPU optimization. As part of our team, you will work closely with full-time engineers to define our technical strategy and contribute to the core systems that drive our GPU optimization platform.
Your Responsibilities
Design and implement scalable infrastructure for AI model training and inference tasks.
Guide the team in making technical decisions and architectural choices.
Qualifications We Seek
Essential Technical Skills
GPU Fundamentals: A strong grasp of GPU architectures, CUDA programming, and parallel computing methodologies.
Deep Learning Frameworks: Skilled in PyTorch, TensorFlow, or JAX, especially for GPU-accelerated applications.
Knowledge of LLM/AI: Solid foundation in large language models, including training, fine-tuning, prompting, and evaluation.
Systems Engineering: Proficient in C++ and Python, and ideally Rust or Go, for developing tools around CUDA.
Preferred Background
Publications or contributions to open-source projects related to inference, GPU computing, or ML/AI are advantageous.
Hands-on experience in conducting large-scale experiments, benchmarking, and performance optimization.

