About the job
Gimlet Labs is building the first heterogeneous neocloud designed for AI workloads. As demand for AI systems grows, the industry faces mounting power, capacity, and cost challenges with today's homogeneous, vertically integrated infrastructure. Gimlet addresses these limits by decoupling AI workloads from physical hardware: it partitions tasks into components and orchestrates them onto the most suitable hardware, optimizing for performance and efficiency. This approach makes it practical to run heterogeneous systems spanning multiple vendors and generations of hardware, including the latest cutting-edge accelerators, delivering substantial performance and cost gains at scale.
Gimlet is also developing a production-grade neocloud tailored for agentic workloads. Customers deploy and manage their workloads through stable, production-ready APIs, without having to think about hardware selection, placement, or low-level performance tuning.
Working with foundation labs, hyperscalers, and AI-native organizations, Gimlet powers real-world production workloads that scale to gigawatt-class AI datacenters.
Gimlet Labs is looking for a Technical Staff Intern to help build our platform for deploying and monitoring AI workloads. In this role, you will apply the latest AI methodologies to create frameworks that optimize AI workloads, and you will play a key part in advancing Gimlet’s compilation framework, which partitions and orchestrates AI workloads across heterogeneous hardware. Your designs will become scalable systems that handle production workloads of millions of requests per second.

