Qualifications
Strong understanding of systems architecture and cloud infrastructure. Proficiency in scripting and automation tools. Experience with monitoring and incident management tools. Excellent problem-solving skills and ability to work collaboratively. Familiarity with DevOps practices and methodologies.
About the job
Join our innovative team at Newton as a Site Reliability Engineer, where you'll play a crucial role in ensuring the reliability and performance of our systems. In this fully remote position, you will collaborate with engineering and operations teams to develop solutions that enhance system uptime and efficiency.
Your expertise will help us transition and maintain our infrastructure, ensuring our services are resilient and scalable. This is an exciting opportunity to contribute to a company that values innovation and teamwork.
About Newton
Newton is a forward-thinking technology company committed to building reliable and efficient systems. We prioritize employee growth and encourage a culture of innovation. Our diverse team thrives in a collaborative environment, where your ideas are valued, and your contributions make a real impact.
Similar jobs
Lead Site Reliability Engineer at Movable Ink (Toronto)
At Movable Ink, we empower marketers with cutting-edge content personalization through data-driven content creation and AI-driven decision-making. Our innovative platform is trusted by top global brands to enhance revenue, streamline workflows, and increase marketing agility. With our headquarters in New York City and a talented team of nearly 600 employees, Movable Ink has a presence across North America, Central America, Europe, Australia, and Japan.
As a Lead Site Reliability Engineer, you will leverage your technical expertise and leadership skills to oversee infrastructure and software development initiatives. You will play a pivotal role in designing and evolving key systems within our multi-cloud, multi-region content serving platform, which handles over 25 billion requests daily. By fostering architectural vision, cross-team collaboration, and mentorship, you will spearhead reliability initiatives and define the technical strategies necessary for scaling our platform to accommodate 50 billion requests per day and beyond.
Join Movable Ink, where we revolutionize content personalization for marketers through dynamic, data-driven content generation and AI decision-making. Our innovative solutions help leading brands worldwide enhance revenue, streamline workflows, and amplify marketing agility. With our headquarters in New York City and nearly 600 employees, we serve a diverse global client base across North America, Central America, Europe, Australia, and Japan.
We are seeking a strategic and proactive Senior Recruiter to join Movable Ink’s People Team, dedicated to fulfilling our expanding talent acquisition needs. In this pivotal role, you will oversee the complete recruiting cycle across various teams and locations. Your collaboration with hiring leaders will be essential in understanding their specific needs and devising effective hiring strategies. You'll expertly balance speed and quality in recruiting while ensuring an exceptional experience for candidates and hiring managers alike. By utilizing data-driven insights and problem-solving skills, you will significantly contribute to the ongoing refinement of our talent acquisition strategy, aimed at attracting and retaining the exceptional talent necessary for Movable Ink's continued success.
Key Responsibilities:
Independently manage the entire recruiting process, serving as a strategic partner to align hiring strategies with business objectives and team dynamics.
Create and implement innovative sourcing strategies that leverage extensive market knowledge to attract and engage a high-quality, diverse candidate pool.
Provide guidance and coaching to hiring managers on best practices for recruitment, interview frameworks, and candidate evaluation to ensure consistent, high-quality hiring decisions.
Meticulously oversee the candidate experience from initial outreach to offer negotiation and onboarding coordination, ensuring seamless communication and efficient processes.
Utilize recruitment data and metrics to identify bottlenecks in the pipeline, forecast hiring needs, and proactively adjust strategies.
Lead or significantly contribute to People team initiatives that enhance recruitment processes and foster organizational growth.
Full-time|CA$140K/yr - CA$180K/yr|On-site|Movable Ink - Toronto
Join Movable Ink, where we empower marketers with cutting-edge data-activated content generation and AI-driven decision making. Trusted by some of the world's most innovative brands, we help maximize revenue, streamline workflows, and enhance marketing agility. With headquarters in New York City and a diverse team of nearly 600 professionals, our reach extends across North America, Central America, Europe, Australia, and Japan.
We are on the lookout for a skilled Senior Backend Engineer specializing in Distributed Systems to join our Activation Team. This dynamic group is dedicated to transforming extensive data sets into actionable insights. In this role, you will design and implement robust distributed software systems capable of processing data at high speeds and scale. The position offers the chance to innovate and create substantial impact by developing sophisticated multi-tiered systems that deliver exceptional value to our esteemed clients. You will be responsible for crafting and deploying high-quality, scalable AI solutions while producing technically impressive products.
Full-time|$211.5K/yr - $258.5K/yr|On-site|Toronto, ON
At Relay, we are revolutionizing the way self-made business owners manage their finances through our cutting-edge digital banking platform. Our mission is to empower entrepreneurs with the tools and knowledge they need to achieve financial clarity, confidence, and control over their earnings. By transforming cash flow management from a source of stress into a clear, actionable insight, we help our customers build stronger and more resilient businesses.
As we continue to grow, the reliability, performance, and resilience of our platform have become critical components of our customer experience and overall business success.
We are currently seeking an Engineering Manager to lead our Site Reliability Engineering (SRE) team. In this pivotal role, you will oversee the scalability, reliability, and robustness of Relay's systems. This position transcends infrastructure management and incident response; it is a leadership opportunity that sits at the nexus of technology, team dynamics, and business strategy. You will mentor and manage a talented SRE team, influence how reliability is integrated across the organization, and ensure our systems can safely scale in response to increasing customer demands and complexity.
If you thrive in technically demanding environments and are passionate about fostering strong teams, a healthy workplace culture, and effective cross-functional collaboration, this position is designed for you.
Join Tenstorrent as a Site Reliability Engineer, where you will play a crucial role in ensuring the reliability and performance of our cutting-edge systems. As a member of our dedicated engineering team, you will work on innovative solutions to enhance our infrastructure and streamline operations. Your expertise will help us deliver exceptional service and uptime to our customers.
Empower Every Identity, from AI to Human
Identity is the cornerstone of unlocking AI's potential. At Okta, we secure AI by creating a trustworthy, neutral infrastructure that allows organizations to confidently navigate this transformative era. This mission demands an unwavering commitment to addressing intricate challenges with significant real-world implications. We seek innovative builders who act with speed and urgency and execute with exceptional proficiency.
This is your chance to engage in work that can define your career. We are fully dedicated to this mission. If you share this passion, we want to hear from you.
Join Us in Securing Every Identity, from AI to Human
Okta is at the forefront of providing a superior authentication experience for hundreds of millions globally. Our focus on reliability forms the bedrock of our product, with a strong commitment to surpassing customer expectations for availability being a fundamental engineering priority. As a Senior Site Reliability Engineer, you will be part of our SRE team, ensuring our production systems are not only fully operational but also resilient, scalable, and poised for remarkable growth. This role goes beyond mere maintenance; it is about playing a significant role in enhancing the core robustness and resilience of our platform. You will be a proactive builder, developing solutions that inherently boost our system's reliability.
Your Responsibilities:
Craft and develop custom software in Go to bolster the platform’s reliability and resilience.
Collaborate with engineering teams to integrate reliability principles, enhancing the availability, performance, and observability of our services.
Utilize your profound understanding of infrastructure and observability to pinpoint improvement opportunities within the product and implement effective solutions.
Participate in our on-call rotation, providing swift, effective responses to critical incidents and utilizing your expertise to troubleshoot, mitigate, or accurately escalate production issues.
Enhance our SRE tooling and processes, focusing on automation and operational efficiency.
Establish, document, and promote reliability best practices throughout the organization.
A Few Important Notes:
Join a profitable B2B SaaS company with teams primarily located in North America.
This position is predominantly remote, with a requirement to meet in Toronto once a month.
Candidates must possess the legal right to work in Canada; we are unable to provide visa sponsorship.
As our platform continues to expand, we are actively seeking a Senior Site Reliability Engineer (SRE) / Cloud Engineer.
Experience with Azure is highly prioritized as it is our primary cloud platform.
About Our Company:
We are recognized as one of the leading retail analytics platforms, empowering marketing teams and brands to decode retail data and execute targeted media campaigns without the need for coding. Our services enhance client understanding of customer behavior and maximize ROI on marketing campaigns, with notable clients including Home Depot.
Utilize a modern cloud stack, with a focus on Azure, CI/CD, containerization, and distributed computing technologies.
About You:
We are in search of a dynamic and skilled Senior SRE/Cloud Engineer who is eager to take on a pivotal role in managing our Cloud Operations, ensuring uptime, reliability, and automation.
Key Responsibilities:
Collaborate with software engineering teams to design, implement, and maintain CI/CD pipelines for rapid and reliable software releases.
Automate and optimize infrastructure provisioning, configuration, and management processes utilizing industry-standard tools and methodologies.
Implement and manage containerization and orchestration technologies to enhance scalability and resource efficiency.
Own the end-to-end availability and performance of our cloud infrastructure; proactively identify potential issues and implement automation to mitigate recurrence.
Participate in an on-call rotation to ensure system stability and responsiveness during off-hours.
Lead the development and implementation of service-level objectives crucial for maintaining product reliability.
Veeva Systems is a mission-driven leader in industry cloud technology, dedicated to accelerating the delivery of therapies to patients in the life sciences sector. As one of the fastest-growing SaaS companies ever, we surpassed $2 billion in revenue last fiscal year with significant growth prospects ahead.
Central to Veeva's mission are our core values: Do the Right Thing, Customer Success, Employee Success, and Speed. Notably, we made history in 2021 by becoming a public benefit corporation (PBC), which legally commits us to balance the interests of our customers, employees, society, and investors.
As a Work Anywhere company, we empower you to choose your work environment, whether it's from home or in our office, enabling you to excel in your preferred setting.
Be part of our journey in transforming the life sciences industry and making a positive impact on our customers, employees, and communities.
The Role
We are seeking a talented Senior Site Reliability Engineer to join our Vault Platform team. In this role, you will be instrumental in ensuring the scalability and reliability of our enterprise applications. You will face complex challenges on a global scale, leveraging your extensive knowledge of Java and modern open-source technologies to create a meaningful impact on our production systems.
The ideal candidate will possess substantial experience with Java applications and cutting-edge open-source technologies, particularly within the context of enterprise software development or a high-growth tech environment. As a Senior SRE, you should have a natural curiosity and a strong aptitude for problem-solving. Your unique engineering perspective will be critical as you understand how systems integrate in production to function efficiently on a global scale, supporting hundreds of customers across North America, Europe, and Asia.
Pinterest is hiring a Senior Site Reliability Engineer in Toronto, ON, Canada. The focus of this role is to ensure that Pinterest’s services remain reliable, scalable, and perform well as the platform grows. Working closely with software engineers, this position involves designing and implementing solutions that strengthen system reliability and efficiency.
Key responsibilities
Partner with engineering teams to maintain and enhance the reliability of Pinterest’s services
Design and implement improvements to support scalability and performance
Troubleshoot and resolve service issues to reduce downtime
Requirements
Extensive experience in site reliability engineering or a closely related field
Strong technical background with proven problem-solving abilities
Comfort working alongside software engineers to improve systems
This position is located in Toronto, ON, Canada.
Full-time|CA$243K/yr - CA$297K/yr|On-site|Toronto, ON
At Relay, we empower self-made business owners with a digital banking platform that transforms financial management into a source of clarity, confidence, and control. Our mission is to replace financial uncertainty with genuine visibility, enabling entrepreneurs to convert their hard work into enduring success. By alleviating the stress of cash flow management, we provide the tools necessary for owners to operate robust and resilient businesses.
As Relay continues its growth trajectory, the reliability, performance, and resilience of our platform have become integral to both our customer experience and overall business success.
This senior leadership position is crucial in steering a team of Site Reliability Engineers while shaping how reliability strategies influence engineering and product decisions throughout the organization. You will determine the future direction of the SRE function, promote operational excellence, and assist the company in anticipating and managing scale challenges before they pose risks.
If you thrive on tackling complex systems, leading organizations, and building resilient platforms that customers depend on daily, we are eager to connect with you!
Key Responsibilities
Lead and enhance Relay’s Site Reliability Engineering function, establishing strategic direction as the company scales.
Define and implement a long-term reliability roadmap, making informed trade-offs under real business and capacity constraints.
Act as the senior reliability voice in discussions involving engineering and product leadership.
Influence the integration of reliability considerations into product planning, architectural decisions, and delivery processes.
Serve as a senior escalation point during critical production incidents, ensuring effective communication and thorough follow-up actions.
Enhance Relay’s observability, performance, and operational maturity practices across teams.
Establish and uphold standards concerning SLOs, operational readiness, incident management, and continuous improvement.
Collaborate with stakeholders in Engineering, Product, Data, and Finance to balance velocity, risk, performance, and cost.
Build and nurture a high-performing SRE organization capable of supporting future growth.
Full-time|CA$144K/yr - CA$200K/yr|Hybrid|Montreal; Toronto
The Storage Layer Services (SLS) team at MongoDB is embarking on an innovative journey to re-architect our cloud storage layer, forming the core of our next-generation cloud storage architecture. This newly established team is dedicated to creating high-performance, multi-tenant distributed storage services that not only enhance our current Atlas storage stack but also enable more efficient customer workloads. As a Senior Site Reliability Engineer, you will collaborate closely with teams responsible for these storage services to establish Service Level Objectives (SLOs), develop capacity plans, and guarantee the reliability, durability, and operational safety of the foundational storage layer supporting Atlas. By joining our small team of seasoned SREs, you will play an integral role in executing a multi-year roadmap for MongoDB’s cloud storage architecture. This position is open to candidates based in our Toronto or Montreal offices or those working remotely from anywhere in Canada, provided they are located in the Eastern or Central time zones.
Momentum Financial Services Group (MFSG) is the company behind Money Mart, Canada’s largest non-bank branch network. With over four decades of experience, MFSG delivers financial solutions for underserved communities, including short-term loans, money transfers, and prepaid cards. Each year, millions of customers rely on these services for timely financial support.
Role Overview: Site Reliability Engineer
The Site Reliability Engineer plays a key role in keeping MFSG’s digital banking and financial services platforms available, responsive, and resilient. This position centers on automating operational tasks, setting and maintaining service-level objectives, and engineering systems to withstand and recover from failures. Daily work involves close collaboration with engineering, DevOps, QA, cybersecurity, and compliance teams to ensure platform reliability meets both technical and regulatory requirements. The role also emphasizes proactive monitoring, incident response, and ongoing improvements to the software delivery process to reduce production risk.
Why Join Momentum Financial Services Group?
Competitive compensation that reflects experience and current market rates
Annual bonus based on individual and company achievements
Comprehensive benefits including health and dental coverage with premiums fully paid, plus Employee Assistance Program access
Retirement planning support to help prepare for the future
Hybrid work model offering flexibility between remote work and in-office collaboration at the Toronto headquarters
Employee perks such as tuition reimbursement, professional development, Perkopolis discounts, and recognition programs
Location
Toronto, Canada (hybrid work model)
About Rootly
At Rootly, we are dedicated to revolutionizing how organizations manage incidents. Our mission is to provide a reliable incident management platform that empowers companies to respond swiftly and effectively when challenges arise. Our innovative approach has established us as leaders in a new multi-billion dollar segment, and we are seeking exceptional talent to help us achieve our ambitious goals. Our customers, including industry giants like NVIDIA, Figma, Canva, and Tripadvisor, trust Rootly for their critical incident management needs. They appreciate our user-friendly platform and unique partnership approach, which has garnered us a stellar 5-star rating on G2. Join us in creating a reliable future for organizations worldwide. Backed by prestigious investors from Y Combinator to key operators in tech, we prioritize transparency and team involvement in our financial health. We conduct monthly business reviews and share updates through our weekly changelog.
About the Role
As a Senior Site Reliability Engineer at Rootly, you will play a crucial role in shaping our technical infrastructure. You will thrive in a dynamic environment where each day presents new challenges and opportunities for growth. This position is perfect for individuals who seek ownership, enjoy tackling complex technical problems, and are driven by a mission to enhance reliability. While the work will be demanding, it promises to be one of the most rewarding experiences in your career.
Collaborate with product teams to enhance the observability, reliability, and performance of services.
Take ownership of our CI/CD pipelines, observability tools, monitoring systems, and incident response processes.
Develop tools and automation to reduce manual toil, enhance engineering velocity, and improve developer experience and system reliability.
Engage deeply with engineering teams to gain insights into system performance and identify cross-functional reliability and scaling concerns.
Design and scale our infrastructure while ensuring top-notch performance and operational excellence.
Join Movable Ink as a Product Security Engineer and play a pivotal role in safeguarding our codebases, CI/CD pipelines, and overall development practices. In this hands-on position, you'll adopt a security-first mindset while collaborating with engineering teams to streamline software delivery while minimizing risk. Your expertise will be crucial in enhancing automation processes that protect our code and infrastructure, especially in the face of rising threats from AI coding tools and supply chain attacks. This role is vital for proactively identifying and mitigating vulnerabilities before they are deployed to production.
Full-time|CA$145K/yr - CA$185K/yr|On-site|Movable Ink - Toronto
Join Movable Ink as a Senior Engineer on our Data Platform team, where you will be instrumental in architecting and developing the systems that facilitate data flow across our organization. In this pivotal role, you will contribute to the creation and maintenance of a unified data platform that is responsible for ingesting, processing, and serving substantial volumes of data that underpin our innovative products. Collaborating with cross-functional teams in engineering, analytics, and infrastructure, you will design scalable data ingestion pipelines and backend services that seamlessly integrate data from diverse sources while ensuring reliability, governance, and high availability. Your expertise will guide the modernization of legacy data pipelines towards contemporary architectures, enhancing the cohesion of our products and services in their data access. Your contributions will significantly influence the evolution and enhancement of Movable Ink’s offerings.
Key Responsibilities:
Contribute to the design and implementation of robust data ingestion pipelines and infrastructure.
Facilitate the migration of legacy data management processes to a new platform, ensuring minimal disruption for current data consumers.
Establish and monitor SLIs and SLOs for the new data platform, utilizing dashboards and alerts to ensure optimal performance.
Create a versatile data storage layer that accommodates various use cases, including transactional, analytical, and machine learning workloads.
Develop comprehensive monitoring and incident response protocols for all data pipelines and services.
Partner with product engineering, analytics, and machine learning teams to define data access strategies.
At Veeva Systems, we are driven by a mission to revolutionize the life sciences industry, empowering companies to bring therapies to patients at an accelerated pace. As one of the fastest-growing SaaS companies in history, we achieved over $2 billion in revenue last fiscal year and possess immense growth potential.
Our core values - Do the Right Thing, Customer Success, Employee Success, and Speed - define who we are. In 2021, we made history by becoming a public benefit corporation (PBC), committed to balancing the interests of our customers, employees, society, and investors.
As a Work Anywhere organization, we offer the flexibility for you to work remotely or from our office, allowing you to thrive in your preferred environment.
Join us in transforming the life sciences sector and making a positive impact on our customers, employees, and communities.
Role Overview Movable Ink is hiring a Senior Data Engineer focused on Event Data for the Toronto office. This role centers on building and maintaining data pipelines that support marketing strategies for leading brands. The work involves modern data technologies and emphasizes both the reliability and accessibility of event data.
About the Role
Movable Ink is hiring a Senior Platform Engineer with a focus on Machine Learning for its Toronto office. This role centers on building and improving machine learning features within our marketing technology platform.
What You Will Do
Work closely with colleagues from different teams to design and implement machine learning solutions that strengthen the platform.
Develop and maintain systems that help clients deliver personalized experiences to their audiences at scale.
Apply engineering expertise to create reliable, maintainable, and scalable ML infrastructure.
Location
This position is based in Toronto at Movable Ink.
Full-time|CA$144K/yr - CA$200K/yr|Hybrid|Toronto; Vancouver
The Team
At MongoDB, our Platform Engineering division within Site Reliability Engineering (SRE) is tasked with managing essential infrastructure and operational functions that empower our engineering teams. This includes our robust, multi-cloud Kubernetes infrastructure, deployment systems, and advanced observability and alerting mechanisms.
The Fabric team is at the forefront of enabling secure communication across systems and from the public internet. Our responsibilities involve designing network architecture, implementing service mesh solutions, and optimizing edge load balancing to ensure the safety of customer data in transit. This team is vital in developing and maintaining a dependable and globally connected multi-cloud network that underpins MongoDB products.
This position can be based in our Toronto or Vancouver offices, or you can work completely remotely from anywhere in North America. We provide flexible hybrid work arrangements for those in our offices.
Mar 26, 2026