About the job
At Docker, we simplify application development, allowing developers to focus on their core objectives. Our remote-first team is globally dispersed, driven by a collective passion for innovation and exceptional developer experiences. With over 20 million monthly users and 20 billion image pulls, Docker stands as the premier tool for building, sharing, and operating applications, trusted by both startups and Fortune 100 companies. We are growing rapidly and are just getting started. Join us for an exciting journey!
The Infrastructure Engineering team is responsible for building and managing the cloud-native platform that powers Docker’s product suite. We design resilient services, automate processes where beneficial, and measure key metrics to ensure hundreds of engineers can deploy safely to millions of users every day.
A key focus of our team is self-service. We develop streamlined platform capabilities that empower internal teams to provision, deploy, observe, and manage services with minimal friction and robust guardrails. We treat our platform as a product, establishing clear contracts, well-defined defaults, and comprehensive documentation. We measure our success by user adoption and a reduction in support requests.
How We Operate
- Documentation and Iteration: We emphasize thorough documentation, code reviews, and incremental releases.
- Sustainable Reliability: We prioritize addressing root causes, establishing effective alerts, and implementing automation over relying on heroics.
- Cross-Functional Collaboration: We work closely with product and security teams by default.
- AI-Driven Execution: We create workflows that reduce manual tasks and enhance incident response, while ensuring guardrails, auditability, and human review.
What You Will Focus On
- Minimizing manual work through automation, including AI-assisted operational workflows.
- Creating self-service onboarding and deployment workflows that reduce ticket volume and accelerate delivery timelines.
- Scaling Kubernetes foundations and evolving our traffic and ingress stack.
Key Responsibilities
1) Self-Service Platform Services
- Develop and manage internal platform services and APIs using Go, focusing on provisioning, quotas, policies, cost insights, and platform workflows.
- Establish streamlined pathways for self-service onboarding and ongoing operations, including access, deployment configurations, observability defaults, and governance frameworks.

