About the job
At Crusoe, our mission is to accelerate the abundance of energy and intelligence. We are building the infrastructure that empowers individuals to use AI creatively without compromising on scale, speed, or sustainability.
Join us in leading the AI revolution with innovative technology at Crusoe. As part of our team, you'll contribute to significant advancements, effect real change, and be at the forefront of responsible and transformative cloud infrastructure.
About This Role:
The Crusoe Cloud Software Development team is looking for an enthusiastic, seasoned Senior Staff Software Engineer who specializes in Hypervisor Virtualization and Research. In this key position, you will design, develop, and optimize our virtualization technologies, which are purpose-built for an AI-centric cloud infrastructure. A deep understanding of hypervisor internals, CPU and memory virtualization, I/O virtualization, and performance optimization is crucial for creating reliable, high-performance, and secure virtualized environments that will support our pioneering AI products. This is a full-time opportunity.
What You’ll Be Working On:
Hypervisor Development & Optimization: Design, develop, and optimize essential hypervisor components (e.g., KVM, QEMU, or bespoke solutions) to maximize performance and efficiency for AI workloads, concentrating on CPU, memory, and I/O virtualization techniques.
Virtualization Research & Innovation: Engage in comprehensive research on advanced virtualization technologies, investigating innovative methods for isolating and accelerating AI compute, storage, and networking resources. Identify and prototype new virtualization features and enhancements that increase density and throughput and reduce latency.
Virtual Hardware & Device Emulation: Create and refine virtual hardware components and device emulation, ensuring peak performance and compatibility for specialized AI accelerators (e.g., GPUs, DPUs) within the virtualized ecosystem.
Performance Analysis & Tuning: Assess and enhance the performance of the entire virtualization stack, from the hypervisor to the guest OS, with a focus on AI/ML workloads. This includes profiling, detecting bottlenecks, and implementing low-level optimizations.
System-Level Troubleshooting: Identify and resolve intricate system issues within the virtualization layer, collaborating closely with hardware and guest OS teams to debug and tackle integration challenges.

