About the job
DigitalOcean is hiring a Hardware Sustaining Engineer in Denver. This role supports the backbone of our cloud platform: the hardware infrastructure that powers our global server fleet. The position reports to the Manager of Infra::Machines::Design.
Role Overview
The Hardware Sustaining Engineer focuses on optimizing and troubleshooting data center hardware at scale. The work centers on keeping our server hardware, cabling, and networking equipment reliable and efficient, so customers can stay focused on their priorities. This position is part of a collaborative team that addresses new challenges as DigitalOcean expands its data center capabilities and explores new technologies.
What You Will Do
- Work as a key member of the Sustaining Engineering team within the Infra::Machines::Design organization.
- Oversee server hardware, cabling, and networking components throughout their lifecycle.
- Monitor the #machines channel and MACHINES JIRA project, responding to issues and driving them to resolution.
- Participate in a 24/7 on-call rotation with the team.
- Serve as Tier 2 escalation support for Datacenter Operations (DCOPS) and Cloud Operations (CloudOps) on hardware and firmware matters.
- Develop and maintain standards and practices for DigitalOcean hardware operations.
- Collaborate with teams such as Qualification, Firmware, Fleet Lifecycle Engineering (FLE), Foresight, and Infrastructure Services to solve tooling, firmware, hardware, and operational challenges.
- Support the creation of tooling and runbooks to improve operational processes for hardware and firmware management.
- Coordinate with Operations teams to define monitoring thresholds, failure modes, and alerting protocols.
- Troubleshoot failures, identify their root causes, and help implement preventive solutions.
- Raise standards across our cloud infrastructure by identifying and adopting industry best practices.

