About the job
At Thinking Machines Lab, we are dedicated to empowering humanity by advancing collaborative general intelligence. Our vision is to create a future where everyone can leverage AI to meet their unique needs and aspirations.
Our talented team comprises scientists, engineers, and innovators who have developed some of the most widely recognized AI products, including ChatGPT and Character.ai, alongside open-weight models like Mistral and popular open-source projects such as PyTorch, OpenAI Gym, Fairseq, and Segment Anything.
About the Position
We are seeking a motivated Infrastructure Research Engineer to design, enhance, and scale the systems that underpin large AI models. Your contributions will directly improve inference speed, cost-effectiveness, reliability, and reproducibility, allowing our teams to focus on improving model capabilities rather than wrestling with infrastructure bottlenecks.
Our mission centers on delivering high-performance, efficient model inference to support real-world applications and accelerate research. In this role, you will own the infrastructure that keeps every experiment, evaluation, and deployment running smoothly at scale.
Note: This is an evergreen role, kept open continuously so candidates can express interest. We receive many applications and may not always have an immediate opening that aligns with your skills and experience, but we encourage you to apply anyway. We review applications regularly and reach out to candidates as new opportunities arise. Feel free to reapply as you gain more experience, though we kindly ask that you apply no more than once every six months. You may also see postings for specific roles tied to particular projects or teams; you are welcome to apply to those directly in addition to this evergreen role.
What You Will Do
- Collaborate with researchers and engineers to transition cutting-edge AI models into production.
- Partner with research teams to ensure high-performance inference for innovative architectures.
- Design and implement new techniques, tools, and architectures that improve latency, throughput, and overall efficiency.
- Optimize our codebase and computing resources (e.g., GPUs) to maximize hardware FLOPs, bandwidth, and memory utilization.
- Extend orchestration frameworks (e.g., Kubernetes, Ray, SLURM) for distributed inference, evaluation, and large-batch serving.
- Establish standards for reliability, observability, and reproducibility throughout the inference stack.
- Publish and share insights through internal documentation, open-source libraries, or technical reports that further the field of scalable AI infrastructure.

