Experience Level
Senior
Qualifications
Proven experience in site reliability engineering or related fields. Strong expertise in cloud platforms (AWS, Azure, or GCP). Proficiency in scripting languages such as Python, Go, or Shell. Experience with container orchestration tools (Kubernetes, Docker). In-depth knowledge of CI/CD practices and tools. Excellent problem-solving skills and a proactive mindset. Strong communication and collaboration abilities.
About the job
Join Saviynt as a Senior Site Reliability Engineer, where you will play a crucial role in ensuring the reliability, availability, and performance of our innovative solutions. You will collaborate with cross-functional teams to implement and maintain scalable infrastructure, automate deployments, and develop monitoring systems that enhance our operational resilience.
About Saviynt
Saviynt is a leading provider of cloud-based identity governance solutions, committed to delivering innovative services that empower organizations to manage their digital identities securely and efficiently. Our culture fosters creativity, collaboration, and continuous learning.
Similar jobs
Search for Site Reliability Engineer Platform Engineering
About CodeRabbit
CodeRabbit is at the forefront of innovation in research and development, dedicated to creating remarkably efficient systems for human-machine collaboration. Our mission is to redefine software development through the integration of cutting-edge Gen AI-driven code review technology, fostering a powerful synergy between human creativity and sophisticated algorithms that surpass the capabilities of individual engineers. By merging advanced language models with human insight, we aim to enhance the efficiency and quality of software development.
The Role
We are looking for a skilled Site Reliability Engineer to enhance our Platform Engineering team in Bengaluru. In this pivotal role, you will be key in ensuring the exceptional availability, performance, and scalability of CodeRabbit's AI-driven code review platform. This position bridges software engineering and systems operations, where you will construct the essential platforms and automation that empower our engineering teams to deploy, monitor, and scale our services reliably.
As a Site Reliability Engineer at CodeRabbit, your responsibilities will include fortifying the scale, reliability, and security of our critical services that manage millions of code reviews, developing sophisticated automation, and overseeing the infrastructure that powers our AI-driven analysis engine. You will work with state-of-the-art technologies, including large language models and distributed architectures designed for significant scale.
Why Choose Headout?
We’re a Rocketship: 9-Figure Revenue, Record Growth, and Profitable
With an impressive revenue of $130M and a presence in over 100 cities, Headout has achieved 18 months of profitability, making it the fastest-growing marketplace in the travel industry. We’ve secured over $60M from leading investors and are committed to building a sustainable business for the long term. Our growth story is just beginning!
Our Mission Matters
In today's digital age, enhancing our human experiences is crucial. At Headout, we aim to provide the easiest, quickest, and most enjoyable way to explore real-life experiences: from immersive tours to museums and live events, we cover it all.
Why Join Us Now?
With a solid foundation and tremendous potential ahead, this is an exciting time to join Headout. Having reached profitability and gained momentum, we have only just begun to build. If you're seeking a role where your contributions will make a significant impact, now is the perfect time to join our team!
Our Culture
Revolutionizing the travel industry is challenging but incredibly rewarding. We value ownership, craftsmanship, and impact, and we're dedicated to doing the best work of our careers. If you're a builder who thrives on solving complex problems, you'll fit right in. Discover more about our unique values here.
The Role
As a Senior Site Reliability Engineer, you will oversee infrastructure management, working with Kubernetes clusters in the cloud, and optimizing workloads. Your responsibilities will include managing CI/CD pipelines, developing reusable workflows using GitHub Actions (or similar tools), conducting canary releases, and enhancing observability. You will design service-level dashboards, fine-tune alerts, and handle incident management across the organization. Additionally, you will enhance application performance through backend changes to optimize API and page performance, improve database efficiency, and eliminate bottlenecks. You will also contribute to platform tools by architecting scalable and efficient platforms for cross-pod use cases and improve developer velocity by building tools and workflows that enhance efficiency across engineering teams. Security responsibilities will include establishing guardrails on...
Join Saviynt as a Site Reliability Engineer and play a pivotal role in ensuring the reliability and performance of our cloud services. You will collaborate with development and operations teams to build and maintain scalable systems while implementing monitoring solutions to enhance system resilience. This position offers you the opportunity to contribute to a dynamic environment focused on continuous improvement and operational excellence.
Join Alpheya, a pioneering B2B WealthTech startup headquartered in Abu Dhabi, and supported by renowned financial institutions such as BNY Mellon and Lunate. With a remarkable fundraising achievement of $300 million, we are developing a cutting-edge wealth technology platform aimed at transforming the wealth management landscape.
Our mission is to empower our clients’ wealth franchises with unique experiences, comprehensive financial solutions, and actionable insights. Our innovative digital wealth management platform is designed to help banks and financial institutions across the Middle East effectively engage and expand their reach within affluent, High Net Worth (HNW), and Ultra High Net Worth (UHNW) investor segments.
As a startup, we embrace the agility and collaboration of cross-functional teams while leveraging the resources and expertise of established organizations.
Join our dynamic team at Ping Identity as a Site Reliability Engineer II, where you'll play a crucial role in maintaining and enhancing the reliability and performance of our services. You'll collaborate with talented engineers to implement scalable solutions that ensure optimal functionality and user experience.
Join Saviynt as a Principal Site Reliability Engineer and lead the charge in ensuring the reliability and performance of our systems. In this role, you will leverage your expertise in cloud infrastructure, automation, and monitoring to enhance our operational capabilities. Collaborate cross-functionally with development and operations teams to design and implement solutions that optimize system performance and availability.
Join Saviynt as a Staff Site Reliability Engineer, where you will play a critical role in enhancing the reliability and performance of our systems. You will work collaboratively with cross-functional teams to ensure seamless operations and create innovative solutions that drive efficiency.
6sense builds technology to help organizations grow, retain customers, and work more efficiently. The company encourages teams and individuals to reach their potential by supporting their goals with thoughtful tools and systems. Core values at 6sense include working as one team, staying curious, doing the right thing, owning outcomes, and creating a sense of belonging. Team members are expected to show initiative, act with integrity, and focus on delivering real value to customers. Meeting challenges directly and influencing the direction of the company’s technology are encouraged.
Role overview
The Site Reliability Engineering Manager, based in Bengaluru, leads a team dedicated to the scalability, reliability, and performance of 6sense’s main infrastructure and customer-facing services. This position blends technical leadership with operational oversight and people management.
What you will do
Mentor and guide a team of SREs who maintain and enhance critical systems
Establish the direction for reliability engineering and define best practices
Oversee system availability and performance, working to minimize downtime
Encourage a culture of proactive problem-solving and continuous improvement
Collaborate with engineering, product, and security teams to design systems that can scale with customer growth
Impact and collaboration
This leadership role is highly visible within 6sense. The SRE Manager helps shape how infrastructure supports both rapid company expansion and evolving customer needs.
About AlphaSense:
AlphaSense is the trusted partner for the world's leading companies, providing cutting-edge market intelligence that removes uncertainty from decision-making. Our platform leverages advanced AI to deliver critical insights from a vast array of trusted content, including equity research, company filings, event transcripts, expert calls, news, and trade journals.
The recent acquisition of Tegus by AlphaSense in 2024 enhances our mission to empower professionals with AI-driven market insights. This collaboration will drive growth, innovation, and content expansion, enabling users to discover deeper insights from extensive content sets. Trusted by over 6,000 enterprise clients, including a significant portion of the S&P 500, AlphaSense was founded in 2011 and is headquartered in New York City, with a global workforce exceeding 2,000 employees across offices in the U.S., U.K., Finland, India, Singapore, Canada, and Ireland. Join us in shaping the future!
About The Role:
As we expand our Site Reliability Engineering (SRE) team, we are seeking an accomplished Staff Site Reliability Engineer to drive the future of reliability, scalability, and performance at AlphaSense. This high-impact, hands-on role will involve architecting core reliability platforms, leading incident responses, and fostering the adoption of SRE best practices throughout our global engineering organization.
Your mission will be to configure our platform to meet the reliability standards of mission-critical systems, aiming for 99.99% uptime while continuously improving our systems and processes. This role transcends traditional system maintenance; it focuses on pioneering the platforms, practices, and culture that facilitate effective engineering scaling. You will serve as a mentor to fellow engineers, influence architectural decisions, and establish the technical standards for reliability throughout the organization.
Your Role: As the Engineering Manager for Site Reliability (SRE) at Moveworks, you will merge software and systems engineering to create and maintain large-scale, distributed, and fault-tolerant systems. Join us as a pivotal member of our SRE team in Bengaluru, where you will be instrumental in architecting and overseeing Moveworks' AI cloud infrastructure and strategy. In a rapidly growing environment, you will design and manage resilient and secure cloud infrastructure, enabling our products to operate reliably and allowing our engineering teams to rapidly build and release customer-facing features. You will collaborate with teams across platform, infrastructure, machine learning, search, data, DevOps, and frontend, building systems that empower these teams to deliver high-quality software promptly. This may involve enhancing CI/CD pipelines, enabling blue/green deployments, creating and managing canary environments, and reducing the risk of faulty code reaching production. Enhance the observability and reliability of Moveworks systems by developing and managing monitoring and alerting infrastructure. Improve debuggability by creating systems that facilitate issue resolution in production and analyze performance. Architect, design, and lead projects aimed at bolstering the reliability of our applications and systems. Serve as a technical leader for adjacent teams based in Bengaluru.
ABOUT TEIKAMETRICS
Teikametrics is leading the transformation of retail with our innovative Artificial Retail Intelligence platform. Our unique orchestration layer elevates AI capabilities tailored for major marketplaces including Amazon, Walmart, TikTok, and other emerging platforms. Discover more at www.teikametrics.com.
ROLE OVERVIEW
We are seeking a talented Site Reliability Engineer to join our team in Bengaluru, India. You will play a crucial role in constructing and maintaining our cloud infrastructure that supports Teikametrics applications and platforms. Additionally, you will contribute to the development of internal DevOps tools and establish best practices to enhance software development and deployment efficiency. This highly impactful position involves guiding deployments utilizing cutting-edge technologies such as Docker, Kubernetes, and Terraform, significantly influencing our organization as a whole. In alignment with a DevOps model, you will collaborate with product development teams to design, deploy, and manage automation tools that enhance predictability, optimize efficiency, and lower operational costs.
Note: Preference will be given to candidates based in Bangalore.
TEAM DYNAMICS
Our team manages services and infrastructure across AWS in conjunction with third-party providers, tackling the security and scalability challenges that arise.
Our daily responsibilities encompass:
Managing and scaling web applications and data platforms.
Developing tools to implement DevOps and security best practices.
Creating reusable and immutable infrastructure using Terraform.
Continuously enhancing our infrastructure with monitoring, logging, and alerting solutions.
Designing authentication and gateway solutions for our infrastructure and applications.
Diagnosing and resolving application issues, advising development teams on design, deployment, and infrastructure decisions.
Participating in on-call rotations, conducting post-mortems, and performing root cause analysis (RCA).
About Zuora
Zuora helps businesses adapt and grow by powering modern models like subscriptions, usage-based pricing, and AI-enabled services. The Zuora platform supports launching new products, automating billing, and driving recurring revenue. With more than ten years at the forefront of the Subscription Economy, Zuora is evolving its platform to deliver a comprehensive quote-to-cash solution. This foundation is built for flexibility and AI-readiness, enabling companies to monetize their offerings efficiently.
Role Overview: Senior Site Reliability Engineer
Zuora is hiring a Senior Site Reliability Engineer in Bengaluru, Karnataka. This role drives reliability strategy and advances automation across AI-powered systems. The engineer will take on complex systems, influence architecture, and work closely with teams across the company.
Key Responsibilities
Define and improve SLOs, SLIs, and resilience patterns for critical services
Develop automation using AI for detection, remediation, and forecasting
Lead initiatives related to cloud infrastructure and Kubernetes platforms
Enhance incident response processes and promote operational excellence
Mentor engineers and help shape reliability practices across the organization
Required Qualifications
8+ years in Site Reliability Engineering, DevOps, or large-scale production operations
Deep experience with AWS, including EC2, EKS, VPC, IAM, RDS, S3, and CloudWatch
Strong skills in Infrastructure-as-Code using Terraform (complex modules, state management, governance)
Advanced programming and automation abilities in Python and Shell, with a history of building robust automation for production
Expertise in Linux systems: performance tuning, security hardening, and advanced troubleshooting
Hands-on experience with distributed systems and data streaming platforms such as Kafka, especially in high-throughput environments
Comfort working independently on complex, ambiguous problems that have broad impact
Demonstrated technical leadership on large-scale reliability or infrastructure projects, including setting direction, influencing design, and mentoring engineers for measurable results
Location
This position is based in Bengaluru, Karnataka, India.
About AlphaSense:
AlphaSense is the trusted partner for leading companies across the globe, providing them with the intelligence needed to make informed decisions. Harnessing cutting-edge AI technology, AlphaSense aggregates valuable insights from a plethora of trustworthy sources, including equity research, company filings, event transcripts, expert calls, news articles, trade journals, and proprietary client research.
In 2024, AlphaSense's acquisition of Tegus marks a significant step towards enhancing our mission of empowering professionals with AI-driven market intelligence. This strategic collaboration aims to innovate and expand content capabilities, enabling our users to discover even deeper insights from extensive datasets. With a clientele that includes over 6,000 enterprise customers, including a majority of the S&P 500, AlphaSense was founded in 2011 and is headquartered in New York City, employing over 2,000 professionals across various global offices in the U.S., U.K., Finland, India, Singapore, Canada, and Ireland. Join our dynamic team!
About The Role:
We are expanding our Site Reliability Engineering (SRE) team and seeking a highly skilled Staff Site Reliability Engineer. In this pivotal role, you will be instrumental in shaping the future of reliability, scalability, and performance at AlphaSense. This hands-on position involves designing and implementing core reliability platforms, leading incident response efforts, and fostering a culture of SRE best practices throughout our global engineering teams.
Your mission will be to develop our platform to the rigorous reliability standards of mission-critical systems, targeting an ambitious 99.99% uptime while continuously refining our systems and processes. This role transcends traditional maintenance; you will be at the forefront of pioneering platforms, practices, and a culture that facilitates effective engineering scalability. You will serve as a mentor to fellow engineers, influence architectural decisions, and set the benchmark for reliability across the organization.
Roku’s Platform Infrastructure team supports the systems behind one of the largest TV streaming platforms, serving over 100 million users and enabling billions in annual transactions. The team’s work underpins the reliability and performance of Roku’s services across the U.S., Canada, and Mexico.
Role overview
The Senior Software Engineer - Site Reliability Engineering position is based in Bengaluru, India. This role centers on applying SRE principles to maintain and improve cloud infrastructure and drive automation across Roku’s platform. The team works with technologies such as Kubernetes, Istio, Envoy, and various observability tools to operate at internet scale.
What you will do
Design and develop large-scale, reliable systems that support Roku’s streaming platform
Apply SRE best practices to enhance system reliability and efficiency
Automate infrastructure management and operational tasks
Collaborate with engineers across departments to deliver solutions that impact the entire company
Requirements
Significant experience in Site Reliability Engineering or related software engineering roles
Proven ability to design and build large-scale systems
Hands-on experience with cloud infrastructure and automation
Familiarity with technologies such as Kubernetes, Istio, Envoy, and observability tools
Strong organizational skills, curiosity, and a drive to learn
Join our dynamic team of expert Hadoop engineers at Acceldata, where you will play a pivotal role in delivering top-notch support services in vendor-agnostic environments. As a Site Reliability Engineer, you will work closely with seasoned professionals, enhancing the availability, scalability, performance, and reliability of our innovative products and our clients' data lake environments.
This position offers a unique opportunity to actively engage with customer feedback, demonstrating empathy and problem-solving skills to provide effective solutions. You will have the chance to develop your technical, business, and interpersonal skills in a collaborative and stimulating environment.
Join Valtech as a Site Reliability Engineer - Monitoring Specialist and be at the forefront of experience innovation. Here, we empower you to challenge the norm and explore uncharted territories in technology. With 6+ years of expertise, you will play a crucial role in shaping digital solutions that transform industries.
Our workplace is designed for continuous learning and meaningful impact. You will collaborate with a dynamic team, develop cutting-edge customer experiences, and drive innovation.
Why Choose Valtech?
We are the experience innovation company and a trusted partner for the world’s leading brands. We offer growth opportunities, a values-driven culture, and global career paths that allow you to shape the future of experience.
Welcome to Okta
At Okta, we are redefining the future of digital identity. As The World’s Identity Company, we empower individuals to securely access technology, anytime, anywhere, across any device or application. Our innovative solutions, including the Okta and Auth0 Platforms, provide robust access management, secure authentication, and automation, placing identity at the forefront of business security and growth.
We value diverse perspectives and experiences, seeking lifelong learners who can enrich our team with their unique insights. Join us in our mission to create a world where identity is truly yours.
Our Workforce Identity Cloud Security Engineering group is on the lookout for a Senior Staff Site Reliability Engineer with a strong passion for DevSecOps, Infrastructure Security, and Site Reliability Engineering (SRE). You will be a part of a pioneering team that is not only delivering exceptional solutions but also setting new benchmarks in cloud security. If you possess a solid background in safeguarding large-scale, mission-critical infrastructure, we want to connect with you.
As a Senior Staff Site Reliability Engineer, you will be instrumental in designing and developing security solutions that fortify our cloud infrastructure. We foster a culture of innovation, encouraging you to advocate for defense-in-depth strategies, adhere to industry security standards, and implement the principle of least privilege to elevate our security posture.
Our Infrastructure Security team is distinguished by its unique blend of security expertise and the ability to design, implement, and deploy infrastructure across various cloud environments without compromising product performance. We are dedicated to enhancing our customers' safety and privacy by integrating security services with core Okta products.
This role is critical in a dynamic, security-focused organization poised for substantial growth. You will serve as a liaison between the Security and Engineering teams, leveraging technical expertise to influence the security roadmap and focus on engineering security aspects across our services. Join us in revolutionizing the industry and making a significant impact!
About Us
At The Economist Group (TEG), we are committed to fostering progress through innovation, independence, and analytical rigor. Our mission is to empower individuals and organizations to navigate the complex challenges and changes in the world. With our analytical expertise and evidence-based insights, we provide clarity and guidance to our clients and subscribers across 170 countries through our esteemed brands, including The Economist, Economist Impact, Economist Intelligence, and Economist Education.
We are currently on the lookout for a dedicated and detail-oriented Site Reliability Engineer to join our expanding TechOps/SRE team. In this vital role, you will engage closely with Product, Engineering, and Software teams to enhance real-time visibility into our infrastructure, applications, and data systems.
Okta, Inc. helps organizations manage identity securely in a rapidly changing landscape. The technical operations team is dedicated to keeping systems available and resilient, with a strong focus on automation and reliability. This Site Reliability Engineering Manager position is based in Bengaluru. The role leads a team of SREs responsible for maintaining and improving Okta’s core infrastructure. Success in this position requires a hands-on leader who values automation, learns quickly, and is committed to both reliability and security.
What you will do
Mentor, manage, and guide a diverse team of SREs.
Promote security best practices and drive projects that strengthen Okta’s infrastructure security.
Respond to production incidents, resolve issues rapidly, and find ways to prevent future problems.
Diagnose and troubleshoot complex production issues to maintain system reliability and performance.
Collaborate with stakeholders across Okta to ensure new capabilities meet goals for reliability, security, and delivery speed.
Work with recruiting and HR to help attract and retain top SRE talent.
Monitor key metrics such as vulnerability scans, security posture, cloud costs, recovery point objectives (RPO), recovery time objectives (RTO), and toil overhead, making sure projects improve these measures.
Support a 24/7 online environment as part of an on-call rotation.
What sets you apart
Proactive mindset: identify and resolve problems as they arise.
Commitment to helping engineering peers grow, leading by example.
Extensive experience managing teams in large-scale production environments, especially with Java/Tomcat and containerized services on AWS (such as EC2, ECS, KMS, Kinesis, RDS) or similar cloud platforms.
Apr 24, 2026