Machine Learning Performance Engineer

Jane Street
New York, New York, United States
On-site, Full-time


Qualifications

We are seeking candidates with a diverse range of skills and experiences, including:

- A solid understanding of contemporary ML techniques and toolsets.
- Experience debugging the performance of training runs from start to finish.
- Low-level GPU knowledge, including PTX, SASS, warps, cooperative groups, Tensor Cores, and the memory hierarchy.
- Proficiency with debugging and optimization tools such as CUDA-GDB, Nsight Systems, and Nsight Compute.
- Familiarity with libraries such as Triton, CUTLASS, CUB, Thrust, cuDNN, and cuBLAS.
- Intuition for the latency and throughput characteristics of CUDA graph launches, Tensor Core arithmetic, warp-level synchronization, and asynchronous memory loads.
- Experience with InfiniBand, RoCE, GPUDirect, PXN, rail optimization, and NVLink, particularly for connecting GPU clusters.
- Knowledge of the collective algorithms that support distributed GPU training in NCCL or MPI.
- An inventive mindset and a willingness to critically evaluate our strategies and tools.
- Fluency in English.

About the job

Join our innovative Machine Learning team at Jane Street as a Performance Engineer, where your expertise in low-level systems programming and optimization will play a critical role in enhancing our machine learning capabilities.

Machine learning is a vital component of Jane Street's global operations. Our dynamic trading environment acts as a unique, rapid-feedback platform for ML experimentation, allowing us to seamlessly integrate new concepts and methodologies.

Your primary responsibility will be to optimize the performance of our models during both training and inference. We prioritize efficient large-scale training, low-latency inference in real-time systems, and high-throughput inference in research scenarios. This involves not only refining CUDA implementations but also taking a holistic approach that spans storage systems, networking, and host- and GPU-level considerations. We aim to ensure that our platform operates efficiently down to the lowest levels, which means questioning whether high throughput actually translates into goodput and measuring how long it really takes to load vectors from the L2 cache.

If you're curious and passionate about tackling complex problems, you'll find a welcoming environment here, even if you haven't previously considered a career in finance.

About Jane Street

Jane Street is a leading quantitative trading firm and liquidity provider, known for its collaborative and innovative approach to technology and finance. At Jane Street, we take pride in our culture of curiosity and continuous learning, fostering an environment where talented individuals can thrive and contribute to our success.
