Betsson Group

Site Reliability Engineer - Gaming

On-site Full-time

Experience Level

Mid to Senior

Qualifications

We are seeking candidates with solid support and troubleshooting capabilities, proficient in most of the following technologies:

  • Observability & Monitoring: Extensive experience in building dashboards and tracking SLAs/SLOs using tools such as Prometheus, Grafana, Coralogix, Splunk, or Loki.
  • Programming & Automation: Strong scripting and coding skills to automate manual tasks (reducing 'toil') and create reliability tools. Proficiency in .NET, Python, PowerShell, or Bash is highly advantageous.
  • Infrastructure as Code (IaC) & Cloud: Experience with provisioning and managing infrastructure using Terraform or Ansible, coupled with a solid understanding of cloud platforms (AWS, GCP, or Azure).
  • Containerization & Orchestration: Hands-on experience in scaling and managing distributed systems using Kubernetes (K8s) and Docker.
  • CI/CD & Change Management: Familiarity with Continuous Integration and Continuous Deployment practices to ensure smooth transitions during code updates.

About the job

Betsson Group seeks a Site Reliability Engineer in Athens to support the stability and performance of online casino services. This position is part of the global Product Development organization, which spans six tech hubs: Malta, Budapest, Stockholm, Tallinn, Kyiv, and Athens. Nearly 600 professionals collaborate across these locations under the guidance of the CTO-CPO.

Role overview

This role focuses on maintaining and improving the reliability of Betsson Group’s gaming platforms. The Site Reliability Engineer works closely with teams across multiple locations to ensure services run smoothly for users worldwide.

What you will do

  • Incident and Problem Management: Investigate system incidents, lead Root Cause Analysis (RCA), and implement long-term solutions. Proactively reduce incidents related to system changes.
  • Observability and Metrics: Define and uphold Service Level Agreements (SLAs), Service Level Objectives (SLOs), and success metrics for new projects. Build and maintain dashboards to support observability.
  • Performance and Capacity: Identify and resolve performance bottlenecks. Optimize infrastructure and code for efficient service delivery, and contribute to capacity planning for hardware or cloud resources.
  • Availability and Change Management: Ensure platform components remain available and functional. Oversee deployments to minimize disruption to live systems.

About Betsson Group

Betsson Group is a leader in the online gaming industry, renowned for its innovative approach and commitment to providing an exceptional gaming experience. Our diverse portfolio of products includes a variety of online casino games, sports betting, and live gaming, catering to players around the world.
