About the job
Discover Okta
At Okta, we are revolutionizing identity management. We empower users to securely access any technology from any device, ensuring seamless interaction with apps and tools. Our innovative Okta Platform and Auth0 Platform prioritize security, authentication, and automation, placing identity at the forefront of business growth and security.
We embrace diversity in our workforce, welcoming individuals who are lifelong learners and can enhance our culture with their unique perspectives. Join us in our mission to create a world where identity is in your hands.
About Our Team
Our Technical Operations (TechOps) team embodies the motto "Always On." We are committed to constructing and maintaining the most reliable, high-performance systems available, empowering organizations to achieve their most critical objectives through secure technology connections.
Your Role
We are seeking a dedicated Senior Site Reliability Engineer (SRE) who excels in managing large-scale cloud production environments. The ideal candidate is a proactive problem-solver who adheres to the principle: "If you have to do it twice, automate it." This role is based in the Washington, D.C. area and involves occasional travel to customer sites. You will play a critical role in ensuring our infrastructure is consistently reliable and meets the highest performance standards, supporting vital national security missions.
Security Requirement: Candidates must be able to obtain and maintain a U.S. security clearance (Secret or Top Secret) as mandated by U.S. Government contracts.
The selected candidate may be subject to drug testing as required by U.S. Government contracts.
Key Responsibilities
- Infrastructure Excellence: Architect, deploy, and oversee Okta’s production infrastructure to guarantee optimal performance and reliability.
- Incident Management: Act as a primary responder to production incidents, conducting thorough troubleshooting and implementing long-term preventive measures.
- Automation Focus: Minimize manual processes by developing automation scripts, enhancing monitoring tools, and documenting technical procedures.

