About the job
The Security Incident Response team is a crucial element of our Resilience Engineering organization, dedicated to safeguarding Datadog's integrity. Our mission is to ensure Datadog is well-equipped to manage and respond to security incidents swiftly, minimizing the impact of threats on our systems and data. After each incident, we work with the teams involved and treat the incident as a learning opportunity. By enhancing our adaptability and addressing systemic issues, we foster a culture that prioritizes resilience among our personnel and systems.
As the Engineering Manager, you will play a pivotal role in achieving this mission by guiding a skilled team of engineers focused on elevating Datadog's incident response capabilities. You will develop tools and automation to enhance efficiency and collaborate with key stakeholders across the organization to align our efforts strategically and measure our progress. As a member of the leadership team, you will actively influence our organizational strategy and culture.
At Datadog, we value a vibrant office culture that fosters relationships and collaboration, encouraging creativity. We operate a hybrid workplace to ensure our Datadogs can achieve a work-life balance that suits their needs.
Key Responsibilities:
- Lead and mentor a team of seasoned incident responders passionate about cultivating a culture of security and resilience. Support engineers in their professional development and provide continuous growth opportunities.
- Act as a hands-on leader during incidents, making decisions under pressure and collaborating across multiple teams to drive resolutions. Participate in on-call duties as part of a secondary rotation with other leaders, providing assistance in resource allocation and decision-making.
- Oversee the team's triage of alerts and signals in Datadog Cloud SIEM, ensuring a high standard of response to emerging threats. Work alongside our Threat Detection team to optimize and calibrate these signals for maximum effectiveness.
- Develop tools, systems, and processes that enhance Datadog's security incident response capabilities, and ensure operational metrics are effectively communicated to stakeholders.
- Lead post-incident analysis to facilitate learning from security incidents, ensuring that postmortems are constructive and actionable. Capture follow-up items that address systemic issues and prevent future occurrences.

