About the job
Cerebras Systems is pioneering the world's largest AI chip, 56 times larger than a traditional GPU. Our innovative wafer-scale architecture delivers AI compute equivalent to dozens of GPUs on a single chip, combined with the simplicity of programming a single device. This groundbreaking approach enables unparalleled training and inference speeds, letting machine learning practitioners run large-scale ML applications seamlessly, without the complications of managing multiple GPUs or TPUs.
Cerebras Systems proudly serves a diverse clientele, including leading model labs, global enterprises, and innovative AI-native startups. Recently, OpenAI entered a multi-year partnership with us to harness 750 megawatts of scale and revolutionize key workloads through ultra-high-speed inference.
Thanks to our unique wafer-scale architecture, Cerebras Inference is the fastest generative AI inference solution in the world, over 10 times faster than GPU-based hyperscale cloud inference services. This speed transforms the user experience of AI applications, enabling real-time iteration and boosting intelligence through additional agentic computation.

