About the job
Cerebras Systems is revolutionizing the AI landscape with the world's largest AI chip, 56 times larger than a traditional GPU. Our wafer-scale architecture delivers the computational power of many GPUs on a single chip, simplifying programming and enabling unmatched training and inference speeds. This lets our users run large machine learning applications seamlessly, without the complexity of managing fleets of GPUs or TPUs.
Our customers include leading model labs, global enterprises, and pioneering AI startups. OpenAI recently announced a multi-year collaboration with Cerebras to deploy 750 megawatts of power, accelerating its workloads with ultra-fast inference.
Built on this wafer-scale architecture, Cerebras Inference is the fastest generative AI inference solution available, more than ten times faster than GPU-based hyperscale cloud services. That speed is transforming user experiences in AI applications, enabling real-time iteration and amplifying intelligence with more compute.
About the Role
We are looking for a highly technical, hands-on Software Engineer to join our Kernel Reliability team. In this role, you will improve the reliability of our advanced compute clusters and of our inference, training, and internal production services. You will work close to the code, building solutions that scale with our rapidly evolving production systems and software services. If you have strong foundations in systems, debugging, and failure analysis, and a passion for building tools and solving hard reliability problems, we would love to connect with you. New graduates are encouraged to apply.