
AI Inference Deployment Engineer

Cerebras Systems · Sunnyvale, CA or Toronto, Canada
On-site · Full-time





Qualifications

The ideal candidate will have a strong background in deploying and managing AI infrastructure, with expertise in systems architecture and performance optimization. Proficiency in programming languages such as Python or C++, along with experience in cloud services and GPU/TPU architectures, is highly desirable. A Bachelor's degree in Computer Science, Engineering, or a related field is preferred.

About the job

Cerebras Systems is at the forefront of AI technology, developing the world's largest AI chip, 56 times larger than conventional GPUs. Our innovative wafer-scale architecture delivers the computational capability of numerous GPUs on a single chip, simplifying programming to the level of a single device. This approach enables Cerebras to achieve unmatched training and inference speeds, allowing machine learning practitioners to run large-scale ML applications without the complexity of managing extensive GPU or TPU resources.

Our clientele includes leading model laboratories, global corporations, and pioneering AI-centric startups. Notably, OpenAI has recently entered into a multi-year partnership with Cerebras, aiming to deploy 750 megawatts of capacity, revolutionizing key workloads with exceptionally rapid inference speeds. 

Thanks to our wafer-scale architecture, Cerebras Inference provides the fastest generative AI inference solution available today, operating over ten times faster than GPU-based hyperscale cloud inference services. This boost in speed is reshaping the user experience in AI applications, enabling real-time iteration and greater intelligence through advanced agentic computation.

About The Role 

We are looking for an exceptionally talented Deployment Engineer to design and manage our state-of-the-art inference clusters. In this role, you will work with the unparalleled Wafer-Scale Engine (WSE) and the systems that exploit its extraordinary capabilities.

About Cerebras Systems

Cerebras Systems is a pioneering company that specializes in the development of advanced AI hardware, specifically the world's largest AI chip, designed to enhance the efficiency and speed of machine learning applications. With a focus on innovative technology and strategic partnerships, Cerebras is transforming the landscape of AI processing.
