
Frontend Inference Compiler - Python/PyTorch Developer in Dubai

On-site · Full-time





Qualifications

Conduct thorough analysis of new models emerging in the generative AI domain and assess their implications for the compilation stack. Develop and maintain software that improves model performance on Cerebras platforms.

About the job

Cerebras Systems is revolutionizing the AI landscape with the world's largest AI chip, 56 times larger than conventional GPUs. Our innovative wafer-scale architecture delivers the computational power equivalent to dozens of GPUs, all within a single chip, providing unparalleled programming simplicity. This unique approach enables us to achieve industry-leading training and inference speeds, allowing machine learning practitioners to seamlessly execute large-scale ML applications without the complexities of managing numerous GPUs or TPUs. 

Cerebras serves a diverse clientele that includes premier model labs, multinational corporations, and pioneering AI-native startups. Notably, OpenAI has established a multi-year partnership with Cerebras to deploy 750 megawatts of capacity, significantly enhancing key workloads through ultra-high-speed inference. 

With our groundbreaking wafer-scale architecture, Cerebras Inference is the fastest generative AI inference solution in the world, outpacing GPU-based hyperscale cloud inference services by more than 10x. This leap in speed is transforming the AI application user experience, enabling real-time iteration and greater intelligence through enhanced agentic computation.

About the Role:

Are you eager to help build the world's fastest generative-model inference? Join the Cerebras Inference Team to develop a unique combination of software and hardware that delivers the best inference capabilities on the market while supporting the largest available models.

The Cerebras wafer-scale inference platform facilitates the execution of Generative models at unprecedented speeds, thanks to its unique hardware architecture that ensures rapid access to local memory, ultra-fast interconnects, and a vast amount of computational resources. 

You will work closely with the latest open and closed generative AI models, optimizing them for the Cerebras inference platform. Your responsibilities will encompass model representation, optimization, and the compilation stack to achieve optimal results on current and future Cerebras platforms. 

About Cerebras Systems

Cerebras Systems is at the forefront of AI technology, creating the world's largest AI chip designed to streamline AI computations. We empower organizations to harness the full potential of machine learning without the complexities of traditional GPU systems.
