Full Stack LLM Engineer

Cerebras Systems
Toronto, Ontario, Canada
On-site, Full-time


Qualifications

- Extensive experience in software engineering, particularly in machine learning models and frameworks.
- Proficient in Python and relevant programming languages.
- Familiarity with deep learning libraries and tools.
- Strong problem-solving skills and the ability to work collaboratively in a team setting.
- Experience working with high-performance computing systems is a plus.

About the job

At Cerebras Systems, we are revolutionizing AI technology with the world's largest AI chip, which is 56 times larger than traditional GPUs. Our innovative wafer-scale architecture equips users with the computational prowess of multiple GPUs on a single chip, simplifying programming tasks. This groundbreaking approach enables unparalleled training and inference speeds, allowing machine learning practitioners to seamlessly operate extensive ML applications without the complexities of managing numerous GPUs or TPUs. 

Cerebras proudly serves an impressive clientele, ranging from leading model laboratories to major global enterprises and pioneering AI startups. A testament to our capabilities, OpenAI has recently forged a multi-year partnership with Cerebras, aimed at delivering 750 megawatts of scale and transforming critical workloads through ultra-fast inference. 

Our state-of-the-art wafer-scale architecture empowers Cerebras Inference to provide the fastest Generative AI inference solution globally, boasting speeds over ten times faster than GPU-based hyperscale cloud inference services. This extraordinary increase in speed is redefining user experiences with AI applications, enabling real-time iterations and enhancing intelligence through additional agentic computation.

About the Role
We are on the lookout for a dynamic and seasoned engineer to join our Inference Core Model Bringup team. This team is tasked with the rapid deployment of advanced open-source models (such as LLaMA, Qwen, etc.) or proprietary models provided by customers on our Cerebras CSX systems. Success in this role necessitates a systems-oriented generalist who flourishes in fast-paced environments and is adept at navigating the entire Cerebras software stack. Your contributions will be instrumental in achieving unprecedented performance, efficiency, and scalability for AI applications.

About Cerebras Systems

Cerebras Systems is at the forefront of AI technology, creating the largest AI chip in the world, designed for unmatched speed and efficiency. Our innovative solutions are transforming the landscape for machine learning, enabling organizations to achieve superior performance without the complexity of traditional systems.
