companyOpenAI logo

Senior Software Engineer, Infrastructure - Analytics Platform

OpenAISan Francisco
Hybrid Full-time

Clicking Apply Now takes you to AutoApply where you can tailor your resume and apply.


Unlock Your Potential

Generate Job-Optimized Resume

One Click And Our AI Optimizes Your Resume to Match The Job Description.

Is Your Resume Optimized For This Role?

Find Out If You're Highlighting The Right Skills And Fix What's Missing

Experience Level

Senior

Qualifications

To be successful in this position, candidates should possess the following qualifications:Proficiency in Rust or C++ programming languages. Experience with distributed systems and performance optimization. Strong understanding of Kubernetes and container orchestration. Ability to work collaboratively within a team and effectively communicate technical concepts. Prior experience in a research-oriented environment is a plus.

About the job

About Our Team

The Scaling team at OpenAI is dedicated to designing, constructing, and managing essential infrastructure that powers groundbreaking research.

Our mission is straightforward: to expedite the advancement of research towards Artificial General Intelligence (AGI). We achieve this by developing foundational systems that researchers depend on, spanning from core infrastructure elements to specialized applications tailored for research. Our systems are designed to scale efficiently with the growing complexity and size of our workloads while ensuring reliability and user-friendliness.

About the Position

We are seeking a Senior Software Engineer to take charge of critical production infrastructure from start to finish.

This role primarily focuses on backend and systems engineering, with a strong emphasis on low-level performance, distributed systems, and the hands-on management of vital services at scale. You will be responsible for transforming ambiguous challenges into actionable plans, delivering pragmatic solutions promptly, and refining them based on real-world feedback and iterations.

This position goes beyond a standard Python backend role; we are specifically on the lookout for candidates with robust systems experience in Rust or C++, particularly in performance-sensitive infrastructure.

This is an in-office role based in San Francisco, CA, following a hybrid model of three days in the office per week. We also provide relocation assistance for new hires.

Your Responsibilities

  • Manage critical infrastructure throughout its lifecycle, including design, implementation, deployment, operation, and ongoing improvements.
  • Develop and maintain high-performance backend systems in Rust or C++ that facilitate core research operations.
  • Design and optimize distributed data and serving systems, considering partitioning, replication, consistency, retries, backpressure, and failure isolation.
  • Identify and resolve production bottlenecks related to latency, throughput, contention, hot spots, and overload scenarios.
  • Oversee mission-critical services, including on-call duties, incident management, postmortems, observability, deployment safety, and zero-downtime migrations.
  • Enhance the reliability of services running on Kubernetes, focusing on resource tuning and failure management.
  • Collaborate closely with engineers and researchers to deliver fast, dependable, and effective systems.
  • Elevate standards through strong technical judgment, ownership, and commitment to quality.

You Will Excel in This Role If You Have:

  • A proven track record of owning and delivering operationally critical systems end to end in ambiguous settings.
  • Experience with systems programming in Rust or C++.
  • Strong analytical skills and a problem-solving mindset.
  • Excellent communication and collaboration skills.

About OpenAI

OpenAI is at the forefront of research in artificial intelligence, dedicated to developing safe and beneficial AI for humanity. We strive to make advancements in AGI that can benefit society as a whole.

Similar jobs

Tailoring 0 resumes

We'll move completed jobs to Ready to Apply automatically.