
Performance Engineer at Cerebras Systems | Toronto, Ontario

Cerebras Systems | Toronto, Ontario, Canada
On-site | Full-time





Responsibilities

- Focus on CPU and memory subsystem optimizations for our Runtime software driver, accelerating cloud and ML training/inference workloads on the modern x86 machines that underpin our AI accelerator.
- Collaborate closely with cross-functional teams to identify performance bottlenecks and implement innovative solutions.
- Test and validate performance improvements, ensuring robust and reliable system operation.
- Perform in-depth analysis and profiling of applications to enhance system performance.
- Stay current with industry trends and advancements in AI technologies.

About the job

Cerebras Systems is revolutionizing AI with the world's largest AI chip, 56 times larger than the biggest traditional GPU. Our wafer-scale architecture delivers the computational power of many GPUs on a single chip, combining unparalleled performance with the simplicity of a single device. This approach lets Cerebras deliver leading-edge training and inference speeds, so machine learning practitioners can run large-scale ML applications without the complexity of managing multiple GPUs or TPUs.

Cerebras counts among its esteemed clients top-tier model laboratories, major global enterprises, and pioneering AI-native startups. Recently, OpenAI announced a multi-year collaboration with Cerebras, aiming to deploy 750 megawatts of power to transform critical workloads with ultra-high-speed inference.

The groundbreaking wafer-scale architecture behind Cerebras Inference delivers the fastest generative AI inference in the world, more than ten times faster than GPU-based hyperscale cloud inference services. This leap in speed is reshaping the user experience of AI applications, enabling real-time iteration and greater intelligence through advanced agentic computation.

About The Role
Join Cerebras as a Performance Engineer on our Runtime Team. Our cutting-edge CS-3 system, fed by a network of modern, powerful x86 machines, has set new benchmarks for high-performance ML training and inference. Working with a chip the size of a dinner plate and 44 GB of on-chip memory, you will challenge and expand your expertise in optimizing AI applications and managing computational workloads, primarily on the x86 hosts that run our Runtime driver.

About Cerebras Systems

Cerebras Systems is at the forefront of AI technology, designing the world’s largest AI chip that revolutionizes machine learning capabilities. Our unique architecture simplifies the deployment of AI applications while delivering unmatched performance, making us a preferred partner for leading enterprises and innovative startups in the AI landscape.
