About the job
Your Role:
As the Engineering Manager for Site Reliability Engineering (SRE) at Moveworks, you will combine software and systems engineering to build and maintain large-scale, distributed, fault-tolerant systems. Join us as a pivotal member of our SRE team in Bengaluru, where you will be instrumental in architecting and overseeing Moveworks' AI cloud infrastructure and strategy.
In a rapidly growing environment, you will design and manage resilient, secure cloud infrastructure that keeps our products running reliably and lets our engineering teams build and release customer-facing features quickly.
You will collaborate with teams across platform, infrastructure, machine learning, search, data, DevOps, and frontend, building systems that empower these teams to deliver high-quality software promptly. This may involve enhancing CI/CD pipelines, enabling blue/green deployments, creating and managing canary environments, and reducing the risk of faulty code reaching production.
- Enhance the observability and reliability of Moveworks systems by developing and managing monitoring and alerting infrastructure.
- Improve debuggability by creating systems that make it easier to resolve production issues and analyze performance.
- Architect, design, and lead projects aimed at bolstering the reliability of our applications and systems.
- Serve as a technical leader for adjacent teams based in Bengaluru.

