About the job
Cerebras Systems is at the forefront of AI technology, developing the world's largest AI chip, 56 times larger than conventional GPUs. Our wafer-scale architecture delivers the compute of many GPUs on a single chip while keeping the programming model as simple as a single device. This approach enables Cerebras to achieve unmatched training and inference speeds, allowing machine learning practitioners to run large-scale ML applications without the complexity of managing sprawling GPU or TPU resources.
Our customers include leading model labs, global corporations, and pioneering AI-native startups. Notably, OpenAI recently entered a multi-year partnership with Cerebras to deploy 750 megawatts of capacity, transforming key workloads with exceptionally fast inference.
Thanks to our wafer-scale architecture, Cerebras Inference is the fastest generative AI inference solution available today, running more than ten times faster than GPU-based hyperscale cloud inference services. This step change in speed is reshaping the user experience of AI applications, enabling real-time iteration and greater intelligence through agentic computation.
About the Role
We are looking for an exceptional Deployment Engineer to design and manage our state-of-the-art inference clusters. In this role, you will work directly with the Wafer-Scale Engine (WSE) and the systems built to exploit its unique capabilities.

