Software Engineer Kernel Reliability jobs in Sunnyvale – Browse 622 openings on RoboApply Jobs

Software Engineer Kernel Reliability jobs in Sunnyvale

Open roles matching “Software Engineer Kernel Reliability” with location signals for Sunnyvale. 622 active listings on RoboApply Jobs.

Cerebras Systems
Full-time|On-site|Sunnyvale CA or Toronto Canada

Cerebras Systems is revolutionizing the AI landscape with the world's largest AI chip, which is 56 times larger than traditional GPUs. Our innovative wafer-scale architecture delivers the computational power of multiple GPUs on a single chip, simplifying programming and enabling unparalleled training and inference speeds. This technology allows our users to …

Mar 5, 2026
Cerebras Systems
Full-time|On-site|Sunnyvale CA or Toronto Canada

Cerebras Systems is at the forefront of AI technology, having developed the world's largest AI chip, which is 56 times larger than traditional GPUs. Our innovative wafer-scale architecture delivers the AI computing power equivalent to dozens of GPUs on a single chip, simplifying programming to a single device. This revolutionary design enables Cerebras to provide unmatched training and inference speeds, empowering machine learning practitioners to seamlessly execute large-scale ML applications without the complexities of managing multiple GPUs or TPUs.
Our clientele includes elite model labs, global corporations, and pioneering AI-native startups. Notably, OpenAI recently entered into a multi-year partnership with Cerebras to deploy 750 megawatts of scale, significantly enhancing key workloads with ultra high-speed inference.
Thanks to our groundbreaking wafer-scale architecture, Cerebras Inference offers the fastest Generative AI inference solution globally, achieving speeds over 10 times faster than GPU-based hyperscale cloud inference services. This substantial speed boost is transforming user experiences in AI applications by enabling real-time iterations and enhancing intelligence through additional agentic computation.
The Role
We are seeking a highly technical and hands-on Engineering Manager to lead our on-field Kernel Reliability team. You will guide a high-performing team in addressing a critical challenge: enhancing the reliability of our advanced compute clusters along with the associated inference, training, and internal production services. In this influential role, you will define the technical vision while remaining closely engaged with the code, crafting scalable solutions for our rapidly expanding system production and software service offerings. If you possess proven expertise in software or hardware reliability, diagnostic tool development, or failure analysis and debugging, we invite you to connect with us.
Responsibilities
Provide hands-on technical leadership, owning the technical vision and roadmap for kernel-centric reliability concerning both internal and customer-facing systems.

Feb 17, 2026
Kernel Engineer

Cerebras Systems

Full-time|On-site|Sunnyvale CA or Toronto Canada

Cerebras Systems is revolutionizing artificial intelligence with the world's largest AI chip, 56 times larger than traditional GPUs. Our innovative wafer-scale architecture delivers unparalleled AI compute power, equating to dozens of GPUs on a single chip, all while maintaining the programming simplicity of a single device. This unique solution enables Cerebras to achieve unmatched training and inference speeds, allowing machine learning practitioners to seamlessly execute large-scale ML applications without the complexities of managing multiple GPUs or TPUs.
We proudly serve a diverse clientele that includes leading model labs, multinational corporations, and pioneering AI-native startups. Notably, OpenAI has recently entered into a multi-year partnership with Cerebras, harnessing 750 megawatts of scale to transform critical workloads with ultra-high-speed inference.
Our cutting-edge wafer-scale architecture powers the fastest Generative AI inference solution globally, boasting speeds over ten times faster than GPU-based hyperscale cloud inference services. This remarkable acceleration is reshaping the user experience of AI applications, facilitating real-time iterations and enhancing intelligence through advanced agentic computation.
About The Role
As a Kernel Engineer, you will be pivotal in crafting high-performance software solutions at the convergence of hardware and software. Your primary responsibility will be to implement, optimize, and scale deep learning operations that fully utilize our custom, massively parallel processor architecture.
You will collaborate with a world-class team focused on designing, tuning for performance, and validating foundational ML and HPC kernels. This role includes building a comprehensive library of parallel and distributed algorithms aimed at maximizing compute utilization and enhancing training efficiency for state-of-the-art AI models. Your contributions will be crucial in unlocking the full capabilities of our hardware and accelerating the advancements in AI.

Feb 23, 2026
Wayve
Full-time|On-site|Sunnyvale

At Wayve, we are dedicated to fostering a diverse, equitable, and respectful culture that values the unique skills and perspectives of every individual, regardless of sex, race, religion or belief, ethnic or national origin, disability, age, citizenship, marital status, sexual orientation, gender identity, veteran status, pregnancy or related conditions (including breastfeeding), or any other protected status under applicable law.
About Us
Founded in 2017, Wayve stands at the forefront of Embodied AI technology. Our cutting-edge AI software and foundational models empower vehicles to perceive, comprehend, and navigate complex environments, significantly enhancing the usability and safety of automated driving systems.
We envision a future where autonomy drives progress. Our intelligent, mapless, and hardware-agnostic AI solutions are crafted for automotive manufacturers, expediting the transition from assisted driving to full automation. In our dynamic environment, we thrive on tackling significant challenges, embracing uncertainty to unlock innovative solutions. We aim high while remaining humble in our quest for excellence, continually learning and adapting as we forge a smarter, safer future.
Your contributions at Wayve are valued. We celebrate diversity, welcome fresh perspectives, and cultivate an inclusive workplace where we support one another in making a meaningful impact.
Join Wayve and let us shape the defining experience of your career!
The Role
The Robot Software team is responsible for the software that powers our internal fleet of vehicles, enabling autonomous driving and data collection for training new driving models. You will collaborate with a motivated and skilled team of engineers to deliver a reliable, stable, and flexible software stack that assists on-road experimentation by our model developers and scientists. Your efforts will empower these teams to iterate rapidly and gather the essential data needed to enhance our autonomous driving capabilities and support new product features, which are vital to Wayve’s mission.
The OS & Kernel team within Robot Software curates Wayve's custom Linux distribution, which operates across our growing development fleet. The team collaborates closely with various divisions within Wayve, including other sections of Robot Software, hardware and supply chain teams, as well as our field engineering and reliability engineering teams. Responsibilities include developing and maintaining our Linux distribution utilizing Yocto, creating and updating Linux kernels, and ensuring optimal system performance.

Feb 17, 2026
Cylake Inc.
Full-time|$150K/yr - $250K/yr|On-site|Sunnyvale

Your Contribution
Become an integral part of a dynamic team dedicated to developing the next generation of cybersecurity solutions from the ground up. Work alongside industry experts with a proven history of innovation as you design, construct, and launch groundbreaking products that will make a significant impact in the field. This role offers you the chance to enhance your career and skills as part of a world-class organization from the very outset.
Job Responsibilities
You will play a pivotal role in architecting and implementing the platform layer, from the Bootloader to system software, for a large-scale embedded system. This encompasses image and software lifecycle management, including packaging, upgrades, high availability, and telemetry/debug infrastructure. You will have the chance to design and implement this system from the ground up.

Mar 5, 2026
Cerebras Systems
Full-time|On-site|Sunnyvale, CA; Toronto, Ontario, Canada

Cerebras Systems is at the forefront of AI technology, developing the world’s largest AI chip that is 56 times larger than conventional GPUs. Our innovative wafer-scale architecture delivers the computational power of dozens of GPUs within a single chip, simplifying programming and enhancing performance. This unique capability enables Cerebras to provide unparalleled training and inference speeds, allowing machine learning practitioners to execute large-scale ML applications seamlessly without the complexities of managing extensive GPU or TPU infrastructures.
Cerebras serves a diverse clientele, including top-tier model labs, global enterprises, and pioneering AI-native startups. OpenAI has recently partnered with Cerebras to leverage 750 megawatts of power, significantly enhancing key workloads through ultra high-speed inference.
Our cutting-edge wafer-scale architecture has made Cerebras Inference the fastest Generative AI inference solution globally, achieving speeds over ten times faster than GPU-based hyperscale cloud inference services. This revolutionary speed is transforming the user experience of AI applications, facilitating real-time iteration and boosting intelligence through enhanced computational capabilities.
About The Role
We invite you to join Cerebras as a Performance & Reliability Engineer within our dynamic Co-Design and Next Generation Team. Our groundbreaking CS-3 system has established benchmarks for high-performance ML training and inference solutions, utilizing a chip the size of a dinner plate with 44GB of on-chip memory that exceeds traditional hardware capabilities. In this role, you will focus on characterizing and optimizing the performance and reliability of state-of-the-art AI models operating on Cerebras' revolutionary hardware.
Responsibilities
Characterize and enhance the performance and reliability of advanced ML hardware/software systems, focusing on minimizing power and thermal fluctuations.
Analyze ML workloads, software kernels, and hardware architecture for their power and performance impacts, synthesizing high-level insights across these layers.
Develop innovative software solutions to enhance system performance and efficiency.

Feb 17, 2026
Applied Intuition
Full-time|On-site|Sunnyvale, California, United States

As a Fleet Reliability Engineer at Applied Intuition, you will be at the forefront of ensuring the reliability and performance of our advanced fleet systems. Your expertise will play a crucial role in the development and deployment of our cutting-edge technology, optimizing fleet operations to guarantee safety and efficiency.

Mar 25, 2026
Illumio
Full-time|On-site|Sunnyvale, California - HQ

Illumio’s Senior Site Reliability Engineer role is based at the company’s Sunnyvale, California headquarters. This is an on-site position, requiring presence in the office five days a week.
Role overview
This position focuses on building and maintaining reliable, scalable infrastructure for Illumio’s applications and services, with an emphasis on Azure cloud solutions. The Senior SRE supports both SaaS and on-premises offerings, working closely with engineering teams to ensure operational resilience and security across hybrid environments.
What you will do
Design, deploy, and maintain highly available infrastructure on Azure for Illumio’s products.
Automate provisioning and configuration management using Infrastructure as Code tools such as Terraform or ARM templates.
Develop and manage CI/CD pipelines to improve software delivery and deployment processes.
Monitor system and application health using Azure monitoring and logging tools, and optimize for performance and availability.
Lead incident response, perform root cause analysis, and document findings to drive continuous improvement.
Collaborate with development teams to design scalable, reliable architectures and provide guidance on cloud-native best practices.
Engineering at Illumio
The engineering team values autonomy, ownership, and collaboration. Work centers on advancing cybersecurity with scalable SaaS services and solutions for on-premises environments. The team emphasizes disciplined engineering, quality, and a supportive culture.

Apr 22, 2026
Illumio
Full-time|On-site|Sunnyvale, California - HQ

Join Us on Our Mission!
At Illumio, we are pioneering the way organizations combat ransomware and data breaches. Our innovative breach containment platform, driven by the Illumio AI Security Graph, enables businesses to effectively identify and mitigate threats across hybrid multi-cloud environments, preventing attacks from escalating into severe crises.
As a recognized leader in the Forrester Wave™ for Microsegmentation, Illumio's solutions empower organizations to adopt Zero Trust models, enhancing cyber resilience for the critical infrastructure that sustains the global economy.
On-Site Work:
This position requires 5 days a week on-site presence at our Sunnyvale, CA headquarters.
Our Vision:
Our Engineering team thrives on a culture of visionary leadership, autonomy, and ownership, fostering an innovative environment that propels us through the dynamic landscape of cybersecurity.
By joining our team, you will contribute to the forefront of Zero Trust Segmentation, utilizing an advanced technology stack that encompasses diverse operating systems, distributed applications, and cutting-edge UI/visualization tools.
Together, we are shaping the future of cybersecurity, committed to developing world-class products guided by diverse perspectives and a shared dedication to innovation amidst unprecedented cyber threats.
Your Role:
As a Site Reliability Engineer II, you will oversee our multi-cloud infrastructure on platforms such as Azure, AWS, and/or GCP. Your responsibilities will include designing new cloud services and applications, collaborating closely with Engineering, SRE/OPS, and Security teams to transition these projects from development to production.
Daily tasks will involve enhancing the reliability and scalability of Illumio's SaaS products while driving continuous improvement initiatives.
We seek candidates with a strong passion for cloud technology, automation, and collaboration, as well as a solid understanding of the Azure cloud platform and related DevOps practices.

Feb 7, 2026
Illumio
Full-time|On-site|Sunnyvale, California - HQ

Join Us in Securing the Future!
At Illumio, we are pioneers in ransomware and breach containment, transforming how organizations defend against cyberattacks and fortifying operational resilience. Our innovative Illumio AI Security Graph powers a breach containment platform that swiftly identifies and neutralizes threats across hybrid multi-cloud environments, preventing minor issues from escalating into catastrophic events.
As a recognized leader in the Forrester Wave™ for Microsegmentation, we enable Zero Trust, bolstering the cyber resilience of the infrastructures, systems, and organizations that keep the world functioning smoothly.
Location: This role requires on-site presence in our Sunnyvale, CA headquarters five days a week.
Vision of Our Team:
Our Engineering team flourishes within a culture that champions visionary leadership, autonomy, and ownership. This dynamic synergy propels us forward in the constantly evolving realm of cybersecurity.
As a member of our team, you will be at the forefront of Zero Trust Segmentation, working with an advanced technology stack that encompasses operating systems, distributed applications, and immersive UI/visualization tools.
We're not just shaping the future of cybersecurity; we’re committed to developing world-class products led by diverse perspectives, backgrounds, and an unwavering commitment to innovation amidst unprecedented cybersecurity challenges.
Your Role:
As a Site Reliability Engineer II, you will oversee and optimize our multi-cloud infrastructure across Azure, AWS, and/or GCP. You will have the opportunity to design new services and applications in the cloud, guiding them from development to production while collaborating closely with Engineering, SRE/Operations, and Security teams.
Your daily responsibilities will include enhancing the reliability and scalability of Illumio's SaaS offerings and spearheading continuous improvement initiatives.
The ideal candidate is driven by a passion for cloud technology, automation, and collaboration, coupled with a solid foundation in Azure cloud platforms and relevant DevOps practices.
Design, deploy, and maintain robust cloud infrastructure solutions on Azure, AWS, and/or GCP to support our applications and services.
Implement Infrastructure as Code (IaC) principles using tools such as Terraform, ARM templates, or CloudFormation to automate provisioning and configuration management.
Develop and maintain CI/CD pipelines for automated software delivery and deployment, utilizing tools like Azure DevOps, AWS CodePipeline, or Jenkins.
Monitor system performance and availability, ensuring optimal operational efficiency.

Mar 23, 2026
Intuitive Surgical, Inc.
Full-time|On-site|Sunnyvale

Join our dynamic team as a Senior Site Reliability Engineer focused on AI/ML solutions. In this role, you will leverage your expertise to enhance the reliability, scalability, and performance of our cutting-edge AI-driven products. You will work collaboratively with cross-functional teams to design, implement, and maintain robust systems that support our mission to revolutionize surgical technology.

Dec 25, 2025
Ceribell
Full-time|$141K/yr - $190K/yr|On-site|Sunnyvale, CA

About Ceribell
Ceribell is at the forefront of medical technology, dedicated to revolutionizing the diagnosis and management of patients with serious neurological conditions. Our innovative Ceribell System is a cutting-edge, point-of-care electroencephalography (EEG) platform that meets the critical needs of patients in acute care settings. Already in use at hundreds of community hospitals, large academic institutions, and major integrated delivery networks across the nation, our team shares a collective mission to enhance critical care with our rapid seizure detection technology. Join us in making a difference!
Position Overview:
We are seeking a talented Senior Software Engineer with a strong backend focus to join our dynamic team in developing the next generation of EEG web applications that cater to vital medical use cases. In this role, you will be instrumental in designing, maintaining, and enhancing the backend systems for our EEG Portal web application, which is essential for healthcare providers, researchers, and clinical teams to access, monitor, and analyze EEG data. You will collaborate closely with fellow engineers, product managers, and stakeholders to ensure that our backend systems are robust, secure, and scalable within a medical environment.
Key Responsibilities:
Backend Development & Maintenance:
Design, develop, and maintain backend systems to support the EEG Portal application, ensuring dependable performance and adherence to healthcare standards.
Implement new features and enhancements to meet clinical and research demands, prioritizing efficiency and scalability.
Troubleshoot, debug, and optimize backend systems to guarantee maximum uptime and reliability for users.
Database Management:
Write optimized database queries and execute data migration strategies.
Monitor and fine-tune database performance, including indexing, replication, and backup processes.
API Development & Integration:
Develop and maintain RESTful APIs that interact with the frontend and other systems.
Ensure APIs are secure, well-documented, and capable of handling large volumes of sensitive medical data.
Integrate third-party services and platforms as needed to enhance functionality.
Ensure backend services comply with regulatory standards, including data encryption, authentication, and auditing.

Mar 2, 2026
Wayve
Full-time|On-site|Sunnyvale

Join Wayve as a skilled Application Software Engineer specializing in software integration and embedded systems. In this role, you will be integral in developing innovative software solutions that empower autonomous vehicles. You will work closely with cross-functional teams to design, implement, and optimize software that facilitates seamless integration of complex systems.

Mar 25, 2026
Wayve
Full-time|On-site|Sunnyvale

At Wayve, we are dedicated to fostering a diverse, equitable, and inclusive culture that values each individual's unique skills and perspectives, irrespective of their sex, race, religion, ethnic or national origin, disability, age, citizenship, marital status, sexual orientation, gender identity, veteran status, pregnancy, or any other legally protected status.
About Us
Founded in 2017, Wayve is at the forefront of developing Embodied AI technology. Our cutting-edge AI software and foundation models empower vehicles to perceive, comprehend, and navigate complex environments, enhancing the safety and usability of automated driving systems.
Our vision is to create autonomous solutions that move the world forward. Our intelligent, mapless, and hardware-agnostic AI products are tailored for automakers, driving the transition from assisted to fully automated driving. In our dynamic environment, we tackle significant challenges with enthusiasm, embracing complexity to unlock innovative solutions. We strive for excellence while remaining humble, consistently learning and evolving towards a smarter and safer future.
At Wayve, your contributions are essential. We cherish diversity, welcome new insights, and promote an inclusive work atmosphere where we support one another to make a meaningful impact.
Join Wayve and embark on a career-defining journey!
The Role
As a Software Engineer, you will engage with Wayve’s next-generation compute and sensor platform and contribute to all aspects of the software development lifecycle.
As a key member of the Robot Software team, you will collaborate to develop software for edge devices that provide critical data and facilitate autonomy across a large fleet of vehicles. You will have a crucial role in ensuring that the software you develop operates reliably at scale, while also working closely with our Embodied AI and Science teams to provide them with the necessary data and interfaces for model training, experimentation, and performance feedback.

Feb 17, 2026
Intuitive Surgical, Inc.
Full-time|On-site|Sunnyvale

Intuitive Surgical, Inc. seeks a Senior Software Engineer to join the Platform Engineering team in Sunnyvale. This role centers on developing and maintaining the foundational software that powers advanced surgical technologies.
Key responsibilities
Design and build core platform software for surgical systems.
Collaborate with other engineering teams to create reliable and scalable solutions.
Drive ongoing enhancements that support improvements in surgical procedures and patient care.
Role focus
This position emphasizes both architecture and hands-on development for the software platform. Work will directly impact the reliability and capabilities of surgical technologies used in healthcare settings.

Apr 24, 2026
DoorDash, Inc.
Full-time|On-site|San Francisco, CA; Sunnyvale, CA; Seattle, WA

Join DoorDash as a Staff Software Engineer specializing in Data Engineering, where you will play a critical role in designing and implementing data solutions that drive business insights and enhance operational efficiency. You will collaborate with cross-functional teams to create robust data pipelines and leverage cutting-edge technology to manage large-scale datasets.

Apr 30, 2026
Mindlance
Full-time|On-site|Sunnyvale

Join our innovative team at Mindlance as a Software Engineer. In this role, you will be instrumental in developing cutting-edge software solutions that enhance user experience and streamline processes. Collaborate with a talented team of engineers and contribute to various stages of software development, from concept to deployment.
Your role will involve coding, testing, and debugging applications while ensuring optimal performance and responsiveness. You will have the opportunity to work with the latest technologies and tools in a dynamic environment that fosters growth and creativity.

Apr 28, 2015
DoorDash, Inc.
Full-time|On-site|San Francisco, CA; Sunnyvale, CA; Seattle, WA

Join our innovative team at DoorDash as a Senior Staff Software Engineer focused on Search. In this role, you'll play a key part in enhancing our search infrastructure, building scalable solutions, and driving impactful results that influence millions of users. You will collaborate with cross-functional teams to advance our technology stack and improve the overall user experience.

Apr 30, 2026
Intuitive Surgical
Full-time|On-site|Sunnyvale

Primary Function of Position:
Become part of Intuitive Surgical, a pioneering team committed to leveraging advanced technology to enhance patient outcomes through improved surgical precision and reduced invasiveness, with patient safety as our foremost concern.
As a member of the Automation, Equipment and Test (AET) Team, you will contribute to creating the robotics that manufacture robotics. Your role will involve the design, development, and maintenance of equipment, fixtures, and tooling that optimize and enhance the manufacturing processes of surgical instruments and accessories.
This position is pivotal in advancing the design and production of new surgical robotic systems and related instruments. You will develop software and algorithms for custom semi-automated electro-mechanical systems, ensuring product performance, reliability, and safety. Close collaboration with product development teams, systems analysts, electrical and mechanical engineers, manufacturing engineers, and quality engineers will be essential to establish a coherent diagnostic strategy and implement effective software solutions.
Key Responsibilities:
Design, develop, and implement software solutions for manufacturing equipment that constructs and tests medical devices, including robotic systems and accessories.
Construct and sustain software infrastructures that facilitate value extraction from generated data.
Analyze and refine manufacturing processes to boost efficiency, lower costs, and elevate productivity.
Comprehend product operations and controls, and create methods to ensure their integrity during high-volume production.
Document, direct, and execute IQOQPQ and DQ validation activities on manufacturing equipment.
Establish, document, and adhere to best practices in software development.
Independently navigate challenges with minimal supervision.
Assume ownership of manufacturing software and collaborate with cross-functional teams to drive projects to completion.
Support and upgrade existing production software.

Feb 24, 2026
Robotics Software Engineer

Intuitive Surgical, Inc.

Full-time|On-site|Sunnyvale

Join our team as a Robotics Software Engineer at Intuitive Surgical, where we are dedicated to transforming minimally invasive care through innovative technologies. You will be instrumental in designing, developing, and testing cutting-edge software solutions that enhance robotic systems. Collaborate with multidisciplinary teams to ensure the seamless integration of software with hardware while maintaining the highest standards of quality.

Mar 10, 2026
