About the job
Join Our Team
Adaptive ML is an innovative AI startup at the forefront of technology, specializing in the development of a Reinforcement Learning Operations (RLOps) platform. We empower enterprises to tailor large language models (LLMs) and deploy them into production, delivering measurable impact.
Our platform provides essential infrastructure for tuning, evaluating, and serving specialized models at scale, leading the charge in task-specific LLM development and managing production-ready workflows that handle millions of requests while optimizing performance and cost across distributed systems.
Our dedicated team has a rich background in creating pioneering open-access large language models. With a $20M seed investment secured from Index Ventures and ICONIQ in early 2024, we are already operational with esteemed clients such as Manulife, AT&T, and Deloitte, particularly in the travel and financial sectors, with more exciting developments on the horizon.
As a Technical Staff Member, you will play a critical role in developing the foundational technologies that drive Adaptive ML, closely collaborating with our Commercial and Product teams to meet their needs. We focus on building resilient, efficient technology and conducting impactful research at scale to guide our roadmap and enhance the value we deliver to our clients.
Role Overview
This position is an open role within our Technical Staff. If any of the following responsibilities resonate with your skills, we encourage you to apply!
As a member of the Technical Staff, your primary focus will be on enhancing our internal LLM Stack, Adaptive Harmony. We approach generative AI through a “big science” lens, merging large-scale engineering with thorough empirical research. We prioritize scalability and systematic empirical validation. We seek driven, business-oriented, and ambitious candidates eager to facilitate the real-world deployment of a highly technical product. Given the early-stage nature of this role, you will have the chance to influence our research initiatives and product development as we expand.
Please note: This position requires in-person attendance at our offices located in Paris or New York.
Key Responsibilities:
Develop robust software using Rust, creating interfaces between user-friendly Python recipes and high-performance distributed training code operating on hundreds of GPUs.
Profile and optimize GPU inference kernels using Triton or CUDA, identifying memory bottlenecks and reducing latency, along with establishing effective benchmarking for inference services.
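To give a flavor of the benchmarking work mentioned above, here is a minimal sketch of a latency-percentile harness for an inference service. This is purely illustrative and not Adaptive's actual tooling: `fake_inference` is a hypothetical stand-in for a real model call, and a production harness would add concurrency, request mixes, and GPU-side timing.

```python
import time
import statistics

def fake_inference(prompt: str) -> str:
    # Hypothetical stand-in for a real inference call;
    # sleeps briefly to simulate model latency.
    time.sleep(0.001)
    return prompt[::-1]

def benchmark(fn, prompts, warmup=5):
    # Warm up first so one-time setup costs (caches, JIT, etc.)
    # don't distort the measured distribution.
    for p in prompts[:warmup]:
        fn(p)
    latencies = []
    for p in prompts:
        start = time.perf_counter()
        fn(p)
        latencies.append((time.perf_counter() - start) * 1000.0)  # ms
    latencies.sort()
    return {
        "p50_ms": latencies[len(latencies) // 2],
        "p99_ms": latencies[int(len(latencies) * 0.99)],
        "mean_ms": statistics.mean(latencies),
    }

stats = benchmark(fake_inference, ["hello world"] * 100)
print(stats)
```

Reporting tail latency (p99) alongside the median matters because production services are judged by their slowest requests, not their average ones.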
