UK Internship Program at Perplexity | London

Hybrid Internship

Experience Level

Entry Level

Responsibilities

Collaborate with the inference team to enhance serving latency and throughput. Support the introduction of new models and cutting-edge inference optimizations or quantization techniques. Optimize the inference pipeline across the entire stack, from GPU kernels to serving endpoints.

Qualifications

Demonstrated engineering excellence with a solid grasp of programming fundamentals and languages (including multi-threaded programming, networking, compilation, systems programming, etc.). Currently pursuing a Master’s or PhD in Computer Science with an emphasis on performance-related topics (such as HPC, Compilers, Distributed Systems). Familiarity with machine learning frameworks (e.g., Torch, JAX). Experience in GPU programming (e.g., CUDA, Triton). Background in High-Performance Computing (e.g., OpenMPI).

About the job

Perplexity is thrilled to introduce our Internship Program, designed for outstanding Master’s or PhD students specializing in Computer Science or Engineering in the UK for the 2025-2026 academic year. This immersive program offers direct collaboration with our AI Inference team and a distinctive opportunity to gain experience within a rapidly expanding AI startup. Exceptional interns may receive an offer for a full-time position upon completion of the program.

Our AI Inference team is integral to the performance of Perplexity's products, overseeing the inference engine and deployments for models ranging from single-node embeddings to advanced distributed sparse Mixture-of-Experts models, all while managing extensive GPU clusters. With a focus on optimizing latency and throughput, the Inference team's work spans the entire serving stack, from GPU kernels to networking and monitoring infrastructure.

About Perplexity

Perplexity is a dynamic and innovative AI startup dedicated to advancing artificial intelligence technologies. We pride ourselves on creating a collaborative environment where talented individuals can thrive and contribute to groundbreaking projects that shape the future of AI.
