About the job
About Our Team
As part of the Fleet Scheduling team, our Full Stack Engineers are committed to creating innovative and scalable interfaces that empower researchers to effectively manage AI workloads across some of the largest supercomputing infrastructures globally. We focus on building robust, high-performance systems that deliver real-time insights, resource tracking, and seamless interactions with complex infrastructures. Our mission is to enhance resource allocation, reduce operational overhead, and develop user-friendly tools that boost researcher productivity and system transparency.
About the Role
As a Full Stack Engineer, you will design, develop, and operate web-based systems that provide an intuitive interface to OpenAI’s supercomputing clusters. You will work closely with researchers, product teams, and infrastructure teams to deliver scalable solutions for monitoring, job scheduling, and resource management. This is an opportunity to work at the forefront of AI infrastructure, designing tools that scale to exascale workloads while remaining usable and performant.
This role is based in San Francisco, CA, with a hybrid work model requiring 3 days in the office per week. We also offer relocation assistance to new employees.
In this role, you will:
Design and develop full-stack web applications for real-time tracking and management of large-scale AI workloads.
Collaborate with researchers and infrastructure teams to translate complex operational needs into intuitive user interfaces and scalable backend systems.
Create data visualization tools (e.g., Gantt charts, dashboards) to enhance insights into job scheduling and resource allocation.
Optimize backend services for high data throughput, ensuring low-latency performance and high availability.
Implement frontend components that enable smooth interactions with scheduling, storage, and compute systems.
Ensure system security, reliability, and scalability across globally distributed supercomputing infrastructure.
You might excel in this role if you:
Have substantial experience in full-stack development, with proficiency in modern frontend frameworks (React, Vue, or Angular) and backend technologies (Python, Go, or Node.js).
Possess a track record of building scalable, high-performance web applications for complex systems.

