Qualifications
Strong understanding of cloud infrastructure and services (AWS, Azure, or GCP). Proficiency in scripting and automation (Python, Bash, etc.). Experience with container orchestration tools (Docker, Kubernetes). Knowledge of monitoring tools (Prometheus, Grafana). Excellent problem-solving skills and a proactive mindset.
About the job
Join Saviynt as a Site Reliability Engineer and play a pivotal role in ensuring the reliability and performance of our cloud services. You will collaborate with development and operations teams to build and maintain scalable systems while implementing monitoring solutions to enhance system resilience. This position offers you the opportunity to contribute to a dynamic environment focused on continuous improvement and operational excellence.
About Saviynt
Saviynt is a leading provider of identity governance and administration solutions. We empower organizations to manage their identities securely and efficiently. With a commitment to innovation and excellence, Saviynt offers a dynamic work environment where employees are encouraged to grow and make impactful contributions.
Similar jobs
Roku’s Platform Infrastructure team supports the systems behind one of the largest TV streaming platforms, serving over 100 million users and enabling billions in annual transactions. The team’s work underpins the reliability and performance of Roku’s services across the U.S., Canada, and Mexico.
Role overview
The Senior Software Engineer - Site Reliability Engineering position is based in Bengaluru, India. This role centers on applying SRE principles to maintain and improve cloud infrastructure and drive automation across Roku’s platform. The team works with technologies such as Kubernetes, Istio, Envoy, and various observability tools to operate at internet scale.
What you will do
Design and develop large-scale, reliable systems that support Roku’s streaming platform
Apply SRE best practices to enhance system reliability and efficiency
Automate infrastructure management and operational tasks
Collaborate with engineers across departments to deliver solutions that impact the entire company
Requirements
Significant experience in Site Reliability Engineering or related software engineering roles
Proven ability to design and build large-scale systems
Hands-on experience with cloud infrastructure and automation
Familiarity with technologies such as Kubernetes, Istio, Envoy, and observability tools
Strong organizational skills, curiosity, and a drive to learn
Why Choose Headout?
We’re a Rocketship: 9-Figure Revenue, Record Growth, and Profitable
With an impressive revenue of $130M and a presence in over 100 cities, Headout has achieved 18 months of profitability, making it the fastest-growing marketplace in the travel industry. We’ve secured over $60M from leading investors and are committed to building a sustainable business for the long term. Our growth story is just beginning!
Our Mission Matters
In today's digital age, enhancing our human experiences is crucial. At Headout, we aim to provide the easiest, quickest, and most enjoyable way to explore real-life experiences: from immersive tours to museums and live events, we cover it all.
Why Join Us Now?
With a solid foundation and tremendous potential ahead, this is an exciting time to join Headout. Having reached profitability and gained momentum, we have only just begun to build. If you're seeking a role where your contributions will make a significant impact, now is the perfect time to join our team!
Our Culture
Revolutionizing the travel industry is challenging but incredibly rewarding. We value ownership, craftsmanship, and impact, and we're dedicated to doing the best work of our careers. If you're a builder who thrives on solving complex problems, you'll fit right in. Discover more about our unique values here.
The Role
As a Senior Site Reliability Engineer, you will oversee infrastructure management, working with Kubernetes clusters in the cloud, and optimizing workloads. Your responsibilities will include managing CI/CD pipelines, developing reusable workflows using GitHub Actions (or similar tools), conducting canary releases, and enhancing observability. You will design service-level dashboards, fine-tune alerts, and handle incident management across the organization. Additionally, you will enhance application performance through backend changes to optimize API and page performance, improve database efficiency, and eliminate bottlenecks. You will also contribute to platform tools by architecting scalable and efficient platforms for cross-pod use cases and improve developer velocity by building tools and workflows that enhance efficiency across engineering teams. Security responsibilities will include establishing guardrails on...
Join Alpheya, a pioneering B2B WealthTech startup headquartered in Abu Dhabi, and supported by renowned financial institutions such as BNY Mellon and Lunate. With a remarkable fundraising achievement of $300 million, we are developing a cutting-edge wealth technology platform aimed at transforming the wealth management landscape. Our mission is to empower our clients’ wealth franchises with unique experiences, comprehensive financial solutions, and actionable insights. Our innovative digital wealth management platform is designed to help banks and financial institutions across the Middle East effectively engage and expand their reach within affluent, High Net Worth (HNW), and Ultra High Net Worth (UHNW) investor segments. As a startup, we embrace the agility and collaboration of cross-functional teams while leveraging the resources and expertise of established organizations.
About Zuora
Zuora helps businesses adapt and grow by powering modern models like subscriptions, usage-based pricing, and AI-enabled services. The Zuora platform supports launching new products, automating billing, and driving recurring revenue. With more than ten years at the forefront of the Subscription Economy, Zuora is evolving its platform to deliver a comprehensive quote-to-cash solution. This foundation is built for flexibility and AI-readiness, enabling companies to monetize their offerings efficiently.
Role Overview: Senior Site Reliability Engineer
Zuora is hiring a Senior Site Reliability Engineer in Bengaluru, Karnataka. This role drives reliability strategy and advances automation across AI-powered systems. The engineer will take on complex systems, influence architecture, and work closely with teams across the company.
Key Responsibilities
Define and improve SLOs, SLIs, and resilience patterns for critical services
Develop automation using AI for detection, remediation, and forecasting
Lead initiatives related to cloud infrastructure and Kubernetes platforms
Enhance incident response processes and promote operational excellence
Mentor engineers and help shape reliability practices across the organization
Required Qualifications
8+ years in Site Reliability Engineering, DevOps, or large-scale production operations
Deep experience with AWS, including EC2, EKS, VPC, IAM, RDS, S3, and CloudWatch
Strong skills in Infrastructure-as-Code using Terraform (complex modules, state management, governance)
Advanced programming and automation abilities in Python and Shell, with a history of building robust automation for production
Expertise in Linux systems: performance tuning, security hardening, and advanced troubleshooting
Hands-on experience with distributed systems and data streaming platforms such as Kafka, especially in high-throughput environments
Comfort working independently on complex, ambiguous problems that have broad impact
Demonstrated technical leadership on large-scale reliability or infrastructure projects, including setting direction, influencing design, and mentoring engineers for measurable results
Location
This position is based in Bengaluru, Karnataka, India.
Join Saviynt as a Senior Site Reliability Engineer, where you will play a crucial role in ensuring the reliability, availability, and performance of our innovative solutions. You will collaborate with cross-functional teams to implement and maintain scalable infrastructure, automate deployments, and develop monitoring systems that enhance our operational resilience.
Join our dynamic team at Harvey as a Staff Software Engineer specializing in Site Reliability Engineering (SRE). In this pivotal role, you will be responsible for ensuring the reliability, availability, and performance of our services. You will collaborate closely with software development teams to build and maintain scalable systems, implement automation strategies, and drive incident response and post-mortem analysis. The ideal candidate will possess a strong background in software engineering, a deep understanding of SRE principles, and a passion for operational excellence. You will play a crucial role in optimizing system performance and enhancing user experience.
ABOUT TEIKAMETRICS
Teikametrics is leading the transformation of retail with our innovative Artificial Retail Intelligence platform. Our unique orchestration layer elevates AI capabilities tailored for major marketplaces including Amazon, Walmart, TikTok, and other emerging platforms. Discover more at www.teikametrics.com.
ROLE OVERVIEW
We are seeking a talented Site Reliability Engineer to join our team in Bengaluru, India. You will play a crucial role in constructing and maintaining our cloud infrastructure that supports Teikametrics applications and platforms. Additionally, you will contribute to the development of internal DevOps tools and establish best practices to enhance software development and deployment efficiency. This highly impactful position involves guiding deployments utilizing cutting-edge technologies such as Docker, Kubernetes, and Terraform, significantly influencing our organization as a whole. In alignment with a DevOps model, you will collaborate with product development teams to design, deploy, and manage automation tools that enhance predictability, optimize efficiency, and lower operational costs. Note: Preference will be given to candidates based in Bangalore.
TEAM DYNAMICS
Our team manages services and infrastructure across AWS in conjunction with third-party providers, tackling the security and scalability challenges that arise. Our daily responsibilities encompass:
Managing and scaling web applications and data platforms.
Developing tools to implement DevOps and security best practices.
Creating reusable and immutable infrastructure using Terraform.
Continuously enhancing our infrastructure with monitoring, logging, and alerting solutions.
Designing authentication and gateway solutions for our infrastructure and applications.
Diagnosing and resolving application issues, advising development teams on design, deployment, and infrastructure decisions.
Participating in on-call rotations, conducting post-mortems, and performing root cause analysis (RCA).
About AlphaSense: AlphaSense is the trusted partner for the world's leading companies, providing cutting-edge market intelligence that removes uncertainty from decision-making. Our platform leverages advanced AI to deliver critical insights from a vast array of trusted content, including equity research, company filings, event transcripts, expert calls, news, and trade journals. The recent acquisition of Tegus by AlphaSense in 2024 enhances our mission to empower professionals with AI-driven market insights. This collaboration will drive growth, innovation, and content expansion, enabling users to discover deeper insights from extensive content sets. Trusted by over 6,000 enterprise clients, including a significant portion of the S&P 500, AlphaSense was founded in 2011 and is headquartered in New York City, with a global workforce exceeding 2,000 employees across offices in the U.S., U.K., Finland, India, Singapore, Canada, and Ireland. Join us in shaping the future!
About The Role: As we expand our Site Reliability Engineering (SRE) team, we are seeking an accomplished Staff Site Reliability Engineer to drive the future of reliability, scalability, and performance at AlphaSense. This high-impact, hands-on role will involve architecting core reliability platforms, leading incident responses, and fostering the adoption of SRE best practices throughout our global engineering organization. Your mission will be to configure our platform to meet the reliability standards of mission-critical systems, aiming for 99.99% uptime while continuously improving our systems and processes. This role transcends traditional system maintenance; it focuses on pioneering the platforms, practices, and culture that facilitate effective engineering scaling. You will serve as a mentor to fellow engineers, influence architectural decisions, and establish the technical standards for reliability throughout the organization.
Join our dynamic team as a Senior Site Reliability Engineer specializing in Oracle Applications DBA. In this pivotal role, you will be responsible for ensuring the reliability, availability, and performance of our Oracle applications, leveraging cutting-edge technology and best practices in site reliability engineering. As a key member of our team, you will collaborate with cross-functional teams to design, implement, and maintain scalable infrastructure solutions. Your expertise will contribute to enhancing our operational efficiency and improving customer satisfaction.
About AlphaSense: AlphaSense is the trusted partner for leading companies across the globe, providing them with the intelligence needed to make informed decisions. Harnessing cutting-edge AI technology, AlphaSense aggregates valuable insights from a plethora of trustworthy sources, including equity research, company filings, event transcripts, expert calls, news articles, trade journals, and proprietary client research. In 2024, AlphaSense's acquisition of Tegus marks a significant step towards enhancing our mission of empowering professionals with AI-driven market intelligence. This strategic collaboration aims to innovate and expand content capabilities, enabling our users to discover even deeper insights from extensive datasets. With a clientele that includes over 6,000 enterprise customers, including a majority of the S&P 500, AlphaSense was founded in 2011 and is headquartered in New York City, employing over 2,000 professionals across various global offices in the U.S., U.K., Finland, India, Singapore, Canada, and Ireland. Join our dynamic team!
About The Role: We are expanding our Site Reliability Engineering (SRE) team and seeking a highly skilled Staff Site Reliability Engineer. In this pivotal role, you will be instrumental in shaping the future of reliability, scalability, and performance at AlphaSense. This hands-on position involves designing and implementing core reliability platforms, leading incident response efforts, and fostering a culture of SRE best practices throughout our global engineering teams. Your mission will be to develop our platform to the rigorous reliability standards of mission-critical systems, targeting an ambitious 99.99% uptime while continuously refining our systems and processes. This role transcends traditional maintenance; you will be at the forefront of pioneering platforms, practices, and a culture that facilitates effective engineering scalability. You will serve as a mentor to fellow engineers, influence architectural decisions, and set the benchmark for reliability across the organization.
Role Overview
Nexthink is hiring a Senior Site Reliability Engineer in Bengaluru. This position focuses on improving the reliability, performance, and scalability of Nexthink’s systems to support smooth client operations.
What You Will Do
Work with teams across the company to design and build solutions that strengthen system stability.
Monitor systems and respond to incidents to minimize downtime and disruptions.
Contribute to ongoing efforts that keep our infrastructure resilient and efficient.
Location
This role is based in Bengaluru.
Join our dynamic team at Ping Identity as a Site Reliability Engineer II, where you'll play a crucial role in maintaining and enhancing the reliability and performance of our services. You'll collaborate with talented engineers to implement scalable solutions that ensure optimal functionality and user experience.
Join Saviynt as a Principal Site Reliability Engineer and lead the charge in ensuring the reliability and performance of our systems. In this role, you will leverage your expertise in cloud infrastructure, automation, and monitoring to enhance our operational capabilities. Collaborate cross-functionally with development and operations teams to design and implement solutions that optimize system performance and availability.
Join Saviynt as a Staff Site Reliability Engineer, where you will play a critical role in enhancing the reliability and performance of our systems. You will work collaboratively with cross-functional teams to ensure seamless operations and create innovative solutions that drive efficiency.
6sense builds technology to help organizations grow, retain customers, and work more efficiently. The company encourages teams and individuals to reach their potential by supporting their goals with thoughtful tools and systems. Core values at 6sense include working as one team, staying curious, doing the right thing, owning outcomes, and creating a sense of belonging. Team members are expected to show initiative, act with integrity, and focus on delivering real value to customers. Meeting challenges directly and influencing the direction of the company’s technology are encouraged.
Role overview
The Site Reliability Engineering Manager, based in Bengaluru, leads a team dedicated to the scalability, reliability, and performance of 6sense’s main infrastructure and customer-facing services. This position blends technical leadership with operational oversight and people management.
What you will do
Mentor and guide a team of SREs who maintain and enhance critical systems
Establish the direction for reliability engineering and define best practices
Oversee system availability and performance, working to minimize downtime
Encourage a culture of proactive problem-solving and continuous improvement
Collaborate with engineering, product, and security teams to design systems that can scale with customer growth
Impact and collaboration
This leadership role is highly visible within 6sense. The SRE Manager helps shape how infrastructure supports both rapid company expansion and evolving customer needs.
Join our innovative team at Arista Networks as a Software Developer (Site Reliability Engineer) focusing on CloudVision as a Service (CVaaS). In this role, you will contribute to the development and maintenance of our robust cloud services, ensuring high availability and scalability. You will collaborate with cross-functional teams to implement best practices in software development and operations.
Your Role: As the Engineering Manager for Site Reliability (SRE) at Moveworks, you will merge software and systems engineering to create and maintain large-scale, distributed, and fault-tolerant systems. Join us as a pivotal member of our SRE team in Bengaluru, where you will be instrumental in architecting and overseeing Moveworks' AI cloud infrastructure and strategy. In a rapidly growing environment, you will design and manage resilient and secure cloud infrastructure, enabling our products to operate reliably and allowing our engineering teams to rapidly build and release customer-facing features. You will collaborate with teams across platform, infrastructure, machine learning, search, data, DevOps, and frontend, building systems that empower these teams to deliver high-quality software promptly. This may involve enhancing CI/CD pipelines, enabling blue/green deployments, creating and managing canary environments, and reducing the risk of faulty code reaching production. Enhance the observability and reliability of Moveworks systems by developing and managing monitoring and alerting infrastructure. Improve debuggability by creating systems that facilitate issue resolution in production and analyze performance. Architect, design, and lead projects aimed at bolstering the reliability of our applications and systems. Serve as a technical leader for adjacent teams based in Bengaluru.
About EarnIn
At EarnIn, we are trailblazers in earned wage access, dedicated to creating solutions that provide immediate financial flexibility for individuals facing the challenge of living paycheck to paycheck. Our community members enjoy the freedom to access their earnings as they are earned, with opportunities to spend, save, and enhance their financial future without the burden of fees, interest, or credit checks. Our leadership team boasts a wealth of experience, backed by prestigious investors such as A16Z, Matrix Partners, DST, and Ribbit Capital, alongside a robust core business poised for significant growth. We are on an exciting trajectory and are eager to welcome world-class talent to join us in shaping our future.
POSITION SUMMARY
We are committed to delivering an exceptional product experience for our community members. Collaborating closely with all teams, we share the responsibility of rapidly delivering production-ready features. Our focus includes building and contributing to infrastructure, reliability tools, and best practices that enable swift and safe deployments. We emphasize aspects such as effective alert management, comprehensive runbooks, clear Service Level Objectives (SLOs), and ensuring that deployments are seamless and uneventful. As a Senior Site Reliability Engineer, you will serve as a technical leader, designing, monitoring, and operating our production systems. Your attention will be on the overall service behavior, including reliability, performance, failure modes, and enhancing the engineering experience. This role is hybrid, based in our Bengaluru office, as part of our expanding operations. EarnIn offers a comprehensive benefits package, including healthcare, internet and cell phone reimbursements, a learning and development stipend, and opportunities for collaboration and travel to our Palo Alto HQ and Bangkok site. Our salary ranges are determined based on role, level, and location.
About CodeRabbit
CodeRabbit is at the forefront of innovation in research and development, dedicated to creating remarkably efficient systems for human-machine collaboration. Our mission is to redefine software development through the integration of cutting-edge Gen AI-driven code review technology, fostering a powerful synergy between human creativity and sophisticated algorithms that surpass the capabilities of individual engineers. By merging advanced language models with human insight, we aim to enhance the efficiency and quality of software development.
The Role
We are looking for a skilled Site Reliability Engineer to enhance our Platform Engineering team in Bengaluru. In this pivotal role, you will be key in ensuring the exceptional availability, performance, and scalability of CodeRabbit's AI-driven code review platform. This position bridges software engineering and systems operations, where you will construct the essential platforms and automation that empower our engineering teams to deploy, monitor, and scale our services reliably. As a Site Reliability Engineer at CodeRabbit, your responsibilities will include fortifying the scale, reliability, and security of our critical services that manage millions of code reviews, developing sophisticated automation, and overseeing the infrastructure that powers our AI-driven analysis engine. You will work with state-of-the-art technologies, including large language models and distributed architectures designed for significant scale.
Mar 26, 2026