Experience Level
Senior
Qualifications
The ideal candidate will possess a strong background in site reliability engineering with proven experience in managing production reliability across data ingestion pipelines, backend services, and Kubernetes deployments. You should be adept at incident response and reliability engineering, capable of diagnosing and resolving complex production issues, and implementing permanent fixes to enhance system stability.
About the job
Join Alpheya, a pioneering B2B WealthTech startup headquartered in Abu Dhabi, and supported by renowned financial institutions such as BNY Mellon and Lunate. With a remarkable fundraising achievement of $300 million, we are developing a cutting-edge wealth technology platform aimed at transforming the wealth management landscape.
Our mission is to empower our clients’ wealth franchises with unique experiences, comprehensive financial solutions, and actionable insights. Our innovative digital wealth management platform is designed to help banks and financial institutions across the Middle East effectively engage and expand their reach within affluent, High Net Worth (HNW), and Ultra High Net Worth (UHNW) investor segments.
As a startup, we embrace the agility and collaboration of cross-functional teams while leveraging the resources and expertise of established organizations.
About Alpheya
Alpheya is an innovative WealthTech startup based in Abu Dhabi, dedicated to creating a state-of-the-art wealth technology platform that revolutionizes financial management for institutions in the Middle East. Backed by BNY Mellon and Lunate, Alpheya is positioned for significant growth and impact in the financial sector.
Why Choose Headout?
We’re a Rocketship: 9-Figure Revenue, Record Growth, and Profitable
With an impressive revenue of $130M and a presence in over 100 cities, Headout has achieved 18 months of profitability, making it the fastest-growing marketplace in the travel industry. We’ve secured over $60M from leading investors and are committed to building a sustainable business for the long term. Our growth story is just beginning!
Our Mission Matters
In today's digital age, enhancing our human experiences is crucial. At Headout, we aim to provide the easiest, quickest, and most enjoyable way to explore real-life experiences. From immersive tours to museums and live events, we cover it all.
Why Join Us Now?
With a solid foundation and tremendous potential ahead, this is an exciting time to join Headout. Having reached profitability and gained momentum, we have only just begun to build. If you're seeking a role where your contributions will make a significant impact, now is the perfect time to join our team!
Our Culture
Revolutionizing the travel industry is challenging but incredibly rewarding. We value ownership, craftsmanship, and impact, and we're dedicated to doing the best work of our careers. If you're a builder who thrives on solving complex problems, you'll fit right in. Discover more about our unique values here.
The Role
As a Senior Site Reliability Engineer, you will oversee infrastructure management, working with Kubernetes clusters in the cloud, and optimizing workloads. Your responsibilities will include managing CI/CD pipelines, developing reusable workflows using GitHub Actions (or similar tools), conducting canary releases, and enhancing observability. You will design service-level dashboards, fine-tune alerts, and handle incident management across the organization. Additionally, you will enhance application performance through backend changes to optimize API and page performance, improve database efficiency, and eliminate bottlenecks. You will also contribute to platform tools by architecting scalable and efficient platforms for cross-pod use cases and improve developer velocity by building tools and workflows that enhance efficiency across engineering teams. Security responsibilities will include establishing guardrails on...
Join our dynamic team as a Senior Reliability Engineer at SanDisk, where you will play a pivotal role in enhancing the reliability and performance of our cutting-edge storage solutions. Your expertise will help drive initiatives that ensure optimal product performance, reliability, and customer satisfaction.
About Zuora
Zuora helps businesses adapt and grow by powering modern models like subscriptions, usage-based pricing, and AI-enabled services. The Zuora platform supports launching new products, automating billing, and driving recurring revenue. With more than ten years at the forefront of the Subscription Economy, Zuora is evolving its platform to deliver a comprehensive quote-to-cash solution. This foundation is built for flexibility and AI-readiness, enabling companies to monetize their offerings efficiently.
Role Overview: Senior Site Reliability Engineer
Zuora is hiring a Senior Site Reliability Engineer in Bengaluru, Karnataka. This role drives reliability strategy and advances automation across AI-powered systems. The engineer will take on complex systems, influence architecture, and work closely with teams across the company.
Key Responsibilities
Define and improve SLOs, SLIs, and resilience patterns for critical services
Develop automation using AI for detection, remediation, and forecasting
Lead initiatives related to cloud infrastructure and Kubernetes platforms
Enhance incident response processes and promote operational excellence
Mentor engineers and help shape reliability practices across the organization
Required Qualifications
8+ years in Site Reliability Engineering, DevOps, or large-scale production operations
Deep experience with AWS, including EC2, EKS, VPC, IAM, RDS, S3, and CloudWatch
Strong skills in Infrastructure-as-Code using Terraform (complex modules, state management, governance)
Advanced programming and automation abilities in Python and Shell, with a history of building robust automation for production
Expertise in Linux systems: performance tuning, security hardening, and advanced troubleshooting
Hands-on experience with distributed systems and data streaming platforms such as Kafka, especially in high-throughput environments
Comfort working independently on complex, ambiguous problems that have broad impact
Demonstrated technical leadership on large-scale reliability or infrastructure projects, including setting direction, influencing design, and mentoring engineers for measurable results
Location
This position is based in Bengaluru, Karnataka, India.
Roku’s Platform Infrastructure team supports the systems behind one of the largest TV streaming platforms, serving over 100 million users and enabling billions in annual transactions. The team’s work underpins the reliability and performance of Roku’s services across the U.S., Canada, and Mexico.
Role overview
The Senior Software Engineer - Site Reliability Engineering position is based in Bengaluru, India. This role centers on applying SRE principles to maintain and improve cloud infrastructure and drive automation across Roku’s platform. The team works with technologies such as Kubernetes, Istio, Envoy, and various observability tools to operate at internet scale.
What you will do
Design and develop large-scale, reliable systems that support Roku’s streaming platform
Apply SRE best practices to enhance system reliability and efficiency
Automate infrastructure management and operational tasks
Collaborate with engineers across departments to deliver solutions that impact the entire company
Requirements
Significant experience in Site Reliability Engineering or related software engineering roles
Proven ability to design and build large-scale systems
Hands-on experience with cloud infrastructure and automation
Familiarity with technologies such as Kubernetes, Istio, Envoy, and observability tools
Strong organizational skills, curiosity, and a drive to learn
About Cognite
Cognite builds AI and data solutions for industrial digitalization. The team works with companies around the world to help them accelerate digital transformation and improve operational efficiency. Drawing on a strong industrial background and a growing set of AI tools, including low-code AI agents, Cognite tackles complex, high-impact challenges across industries. The culture values agility, ownership, and a willingness to question established ways of working. People who see challenges as a chance to move forward tend to thrive here. Cognite’s Moonshot initiative targets unlocking $100 billion in customer value by 2035, aiming to reshape how global industries operate.
Join Saviynt as a Senior Site Reliability Engineer, where you will play a crucial role in ensuring the reliability, availability, and performance of our innovative solutions. You will collaborate with cross-functional teams to implement and maintain scalable infrastructure, automate deployments, and develop monitoring systems that enhance our operational resilience.
Join our dynamic team as a Senior Site Reliability Engineer specializing in Oracle Applications DBA. In this pivotal role, you will be responsible for ensuring the reliability, availability, and performance of our Oracle applications, leveraging cutting-edge technology and best practices in site reliability engineering.
As a key member of our team, you will collaborate with cross-functional teams to design, implement, and maintain scalable infrastructure solutions. Your expertise will contribute to enhancing our operational efficiency and improving customer satisfaction.
Role Overview
Nexthink is hiring a Senior Site Reliability Engineer in Bengaluru. This position focuses on improving the reliability, performance, and scalability of Nexthink’s systems to support smooth client operations.
What You Will Do
Work with teams across the company to design and build solutions that strengthen system stability.
Monitor systems and respond to incidents to minimize downtime and disruptions.
Contribute to ongoing efforts that keep our infrastructure resilient and efficient.
Location
This role is based in Bengaluru.
ABOUT TEIKAMETRICS
Teikametrics is leading the transformation of retail with our innovative Artificial Retail Intelligence platform. Our unique orchestration layer elevates AI capabilities tailored for major marketplaces including Amazon, Walmart, TikTok, and other emerging platforms. Discover more at www.teikametrics.com.
ROLE OVERVIEW
We are seeking a talented Site Reliability Engineer to join our team in Bengaluru, India. You will play a crucial role in constructing and maintaining our cloud infrastructure that supports Teikametrics applications and platforms. Additionally, you will contribute to the development of internal DevOps tools and establish best practices to enhance software development and deployment efficiency. This highly impactful position involves guiding deployments utilizing cutting-edge technologies such as Docker, Kubernetes, and Terraform, significantly influencing our organization as a whole. In alignment with a DevOps model, you will collaborate with product development teams to design, deploy, and manage automation tools that enhance predictability, optimize efficiency, and lower operational costs.
Note: Preference will be given to candidates based in Bangalore.
TEAM DYNAMICS
Our team manages services and infrastructure across AWS in conjunction with third-party providers, tackling the security and scalability challenges that arise.
Our daily responsibilities encompass:
Managing and scaling web applications and data platforms.
Developing tools to implement DevOps and security best practices.
Creating reusable and immutable infrastructure using Terraform.
Continuously enhancing our infrastructure with monitoring, logging, and alerting solutions.
Designing authentication and gateway solutions for our infrastructure and applications.
Diagnosing and resolving application issues, advising development teams on design, deployment, and infrastructure decisions.
Participating in on-call rotations, conducting post-mortems, and performing root cause analysis (RCA).
About AlphaSense:
AlphaSense is the trusted partner for the world's leading companies, providing cutting-edge market intelligence that removes uncertainty from decision-making. Our platform leverages advanced AI to deliver critical insights from a vast array of trusted content, including equity research, company filings, event transcripts, expert calls, news, and trade journals.
The recent acquisition of Tegus by AlphaSense in 2024 enhances our mission to empower professionals with AI-driven market insights. This collaboration will drive growth, innovation, and content expansion, enabling users to discover deeper insights from extensive content sets. Trusted by over 6,000 enterprise clients, including a significant portion of the S&P 500, AlphaSense was founded in 2011 and is headquartered in New York City, with a global workforce exceeding 2,000 employees across offices in the U.S., U.K., Finland, India, Singapore, Canada, and Ireland. Join us in shaping the future!
About The Role:
As we expand our Site Reliability Engineering (SRE) team, we are seeking an accomplished Staff Site Reliability Engineer to drive the future of reliability, scalability, and performance at AlphaSense. This high-impact, hands-on role will involve architecting core reliability platforms, leading incident responses, and fostering the adoption of SRE best practices throughout our global engineering organization.
Your mission will be to configure our platform to meet the reliability standards of mission-critical systems, aiming for 99.99% uptime while continuously improving our systems and processes. This role transcends traditional system maintenance; it focuses on pioneering the platforms, practices, and culture that facilitate effective engineering scaling. You will serve as a mentor to fellow engineers, influence architectural decisions, and establish the technical standards for reliability throughout the organization.
Join SanDisk as a Principal Engineer in Reliability Engineering, where you will lead the charge in enhancing the reliability and performance of our cutting-edge storage solutions. This pivotal role involves applying advanced engineering principles to drive product reliability and test methodologies, ensuring our products meet the highest standards.
SanDisk is hiring a Senior Manager, Product Development Engineering, with a focus on memory reliability. This position is located in Bengaluru and centers on leading technical teams to enhance the reliability of memory products.
Key Responsibilities
Direct projects aimed at improving memory reliability across multiple product lines
Supervise failure analysis efforts and conduct root cause investigations
Use expertise in device physics to uphold high quality standards
Collaboration
This leadership role works closely with engineers and technical specialists who are committed to advancing memory technologies and maintaining product excellence.
About EarnIn
At EarnIn, we are trailblazers in earned wage access, dedicated to creating solutions that provide immediate financial flexibility for individuals facing the challenge of living paycheck to paycheck. Our community members enjoy the freedom to access their earnings as they are earned, with opportunities to spend, save, and enhance their financial future without the burden of fees, interest, or credit checks.
Our leadership team boasts a wealth of experience, backed by prestigious investors such as A16Z, Matrix Partners, DST, and Ribbit Capital, alongside a robust core business poised for significant growth. We are on an exciting trajectory and are eager to welcome world-class talent to join us in shaping our future.
POSITION SUMMARY
We are committed to delivering an exceptional product experience for our community members. Collaborating closely with all teams, we share the responsibility of rapidly delivering production-ready features. Our focus includes building and contributing to infrastructure, reliability tools, and best practices that enable swift and safe deployments. We emphasize aspects such as effective alert management, comprehensive runbooks, clear Service Level Objectives (SLOs), and ensuring that deployments are seamless and uneventful. As a Senior Site Reliability Engineer, you will serve as a technical leader, designing, monitoring, and operating our production systems. Your attention will be on the overall service behavior, including reliability, performance, failure modes, and enhancing the engineering experience.
This role is hybrid, based in our Bengaluru office, as part of our expanding operations. EarnIn offers a comprehensive benefits package, including healthcare, internet and cell phone reimbursements, a learning and development stipend, and opportunities for collaboration and travel to our Palo Alto HQ and Bangkok site. Our salary ranges are determined based on role, level, and location.
About AlphaSense:
AlphaSense is the trusted partner for leading companies across the globe, providing them with the intelligence needed to make informed decisions. Harnessing cutting-edge AI technology, AlphaSense aggregates valuable insights from a plethora of trustworthy sources, including equity research, company filings, event transcripts, expert calls, news articles, trade journals, and proprietary client research.
In 2024, AlphaSense's acquisition of Tegus marks a significant step towards enhancing our mission of empowering professionals with AI-driven market intelligence. This strategic collaboration aims to innovate and expand content capabilities, enabling our users to discover even deeper insights from extensive datasets. With a clientele that includes over 6,000 enterprise customers, including a majority of the S&P 500, AlphaSense was founded in 2011 and is headquartered in New York City, employing over 2,000 professionals across various global offices in the U.S., U.K., Finland, India, Singapore, Canada, and Ireland. Join our dynamic team!
About The Role:
We are expanding our Site Reliability Engineering (SRE) team and seeking a highly skilled Staff Site Reliability Engineer. In this pivotal role, you will be instrumental in shaping the future of reliability, scalability, and performance at AlphaSense. This hands-on position involves designing and implementing core reliability platforms, leading incident response efforts, and fostering a culture of SRE best practices throughout our global engineering teams.
Your mission will be to develop our platform to the rigorous reliability standards of mission-critical systems, targeting an ambitious 99.99% uptime while continuously refining our systems and processes. This role transcends traditional maintenance; you will be at the forefront of pioneering platforms, practices, and a culture that facilitates effective engineering scalability. You will serve as a mentor to fellow engineers, influence architectural decisions, and set the benchmark for reliability across the organization.
Sandisk is seeking a Staff Engineer specializing in Software Reliability Engineering based in Bengaluru. The position focuses on enhancing the reliability and performance of software products throughout the company.
Key responsibilities
Work closely with cross-functional teams to identify software issues and develop solutions
Drive initiatives that improve system reliability and scalability
Support ongoing efforts to ensure Sandisk’s software remains dependable and efficient
Role overview
This role centers on maintaining and improving the quality of software systems. Collaboration and problem-solving are essential, as the work involves addressing technical challenges that impact software stability and user experience.
Join Saviynt as a Site Reliability Engineer and play a pivotal role in ensuring the reliability and performance of our cloud services. You will collaborate with development and operations teams to build and maintain scalable systems while implementing monitoring solutions to enhance system resilience. This position offers you the opportunity to contribute to a dynamic environment focused on continuous improvement and operational excellence.
Join our dynamic team at Ping Identity as a Site Reliability Engineer II, where you'll play a crucial role in maintaining and enhancing the reliability and performance of our services. You'll collaborate with talented engineers to implement scalable solutions that ensure optimal functionality and user experience.
Join Saviynt as a Principal Site Reliability Engineer and lead the charge in ensuring the reliability and performance of our systems. In this role, you will leverage your expertise in cloud infrastructure, automation, and monitoring to enhance our operational capabilities. Collaborate cross-functionally with development and operations teams to design and implement solutions that optimize system performance and availability.
Join Saviynt as a Staff Site Reliability Engineer, where you will play a critical role in enhancing the reliability and performance of our systems. You will work collaboratively with cross-functional teams to ensure seamless operations and create innovative solutions that drive efficiency.
Nov 11, 2025