Who You Are:
You are a dedicated and customer-oriented developer who thrives on enhancing user experiences through data-driven solutions. Your enthusiasm for diagnosing and resolving production challenges drives you to proactively identify and eliminate issues.
Your expertise encompasses cloud technologies, automation, infrastructure as code, networking, and microservices. You possess strong programming skills in Python, Java, or Go, with a keen desire to expand your knowledge further.
You demonstrate a solid understanding of software engineering principles across the software development lifecycle, including coding standards, code reviews, security protocols, source control, build processes, automated testing, deployment, monitoring, chaos engineering, and self-healing operations. You are proficient with tools and technologies such as CloudFormation, Terraform, New Relic, AWS Lambda, serverless architectures, Elasticsearch, Docker, Kubernetes, Spark, Flink, Jenkins, GitHub, Artifactory, and Jira.
Your strong analytical and problem-solving skills, combined with effective verbal and written communication, enable you to lead production incident responses and postmortems successfully.
What to Expect as a Member of the SRE Team at WatchGuard:
The SRE team at WatchGuard is responsible for ensuring the reliability and security of our production cloud environments, collaborating closely with application development teams to deliver exceptional customer experiences. As you familiarize yourself with our systems, your responsibilities will include:
- Collaborating with development teams to ensure seamless production operations and managing large-scale event responses.
- Establishing operational and security policies, standards, and processes for development teams.
- Assisting development teams in defining, monitoring, and meeting their service-level agreements (SLAs) through well-defined service-level indicators (SLIs) and service-level objectives (SLOs).
A Typical Day in the Life of a Site Reliability Developer on the SRE Team at WatchGuard:
As a Site Reliability Developer at WatchGuard, your daily activities might involve:
- Working collaboratively with application teams in production environments across AWS, Azure, and hybrid cloud infrastructures to ensure effective monitoring, security, reliability, automation, and support.
- Promoting a culture of operational excellence through simplification, automation, analysis, and process evolution.
- Advocating for security and operational best practices to establish your reputation as a cloud expert among our diverse global development teams.
- Striving for the best possible customer experience, even during production incidents, by actively participating in incident management and resolution.