
Mid-Level Technical Staff - Training Infrastructure

Reflection AI · San Francisco
On-site · Full-time





Qualifications

Desired Experience

  • Proven track record in deploying and managing large-scale GPU systems for inference or model serving.

  • Several years of practical experience developing and maintaining production infrastructure.

  • Deep understanding of GPU performance characteristics and optimization strategies.

  • Hands-on experience with modern inference frameworks such as SGLang, Megatron, or other high-performance LLM runtimes.

  • Familiarity with distributed reinforcement learning infrastructure or rollout-generation systems.

  • Experience optimizing throughput for large-scale model execution workloads.

  • In-depth experience with GPU kernels or low-level performance optimization techniques.

About the job

Our Vision

At Reflection AI, we are on a mission to develop open superintelligence and democratize its access for everyone.

Our team, hailing from renowned organizations like DeepMind, OpenAI, Google Brain, Meta, Character.AI, and Anthropic, is dedicated to creating open-weight models that serve individuals, enterprises, and even nations.

Role Overview

  • Design, construct, and manage state-of-the-art GPU infrastructure for high-throughput model inference and mid-training processes.

  • Develop systems that facilitate synthetic data generation and reinforcement learning pipelines at scale.

  • Create high-performance inference platforms capable of serving and evaluating models across thousands of GPUs.

  • Optimize throughput, latency, and GPU utilization for large language model inference and deployment tasks.

  • Construct infrastructure that enhances reinforcement learning pipelines, including large-scale rollout generation, evaluation, and policy enhancement loops.

  • Collaborate closely with research teams to support distributed reinforcement learning workloads and extensive model evaluation infrastructure.

  • Enhance model execution performance through kernel-level optimization, model parallelism strategies, and GPU runtime improvements.

  • Develop distributed systems that enable large-scale synthetic data generation and reinforcement learning-driven training workflows.

  • Identify and address performance bottlenecks across inference runtimes, GPU kernels, networking, and distributed computing systems.

About Reflection AI

Reflection AI is committed to pioneering the development of open superintelligence, making advanced AI technologies accessible to all. Our diverse team of experts from leading AI organizations is focused on building innovative solutions that cater to a wide range of users, from individuals to large institutions.
