
Senior Reliability Operations Engineer - Remote (Mexico)

Serve Robotics
Mexico City, MX (remote)
Remote, Full-time

Experience Level

Senior

Qualifications

  • Extensive experience in incident management and response.

  • Strong understanding of cloud systems and robotic technologies.

  • Proficient in creating runbooks and operational documentation.

  • Experience with automation scripting and tools.

  • Familiarity with metrics, logs, and monitoring tools.

About the job

At Serve Robotics, we are transforming urban mobility with our innovative sidewalk robot, which embodies our vision for a future where deliveries are efficient and accessible. Our robots are designed to navigate congested streets, making deliveries available to a broader audience and supporting local businesses.

The Serve fleet has been successfully delivering joy to merchants, customers, and pedestrians across major cities such as Los Angeles, Miami, Dallas, Atlanta, and Chicago. We are looking for passionate individuals who can help evolve robotic deliveries from a fascinating novelty into a seamless, everyday occurrence.

About Us

We are a team of seasoned professionals from the tech industry, specializing in software, hardware, and design, united in our mission to create the future we envision. Our focus is on addressing real-world challenges using robotics, machine learning, and computer vision, all while ensuring an exceptional user experience. Our diverse and agile team thrives on collaboration and respect, believing that complex problems are best solved together.

Role Overview

As the Senior Reliability Operations Engineer, you will be pivotal in enhancing operational reliability across our regional operations. You will oversee incident response, manage escalations, and provide Tier 2 support for both robotic and cloud systems. This role involves developing and refining runbooks, automations, and operational processes, working closely with product engineering and Site Reliability Engineers (SREs). You will act as the regional incident lead, ensuring timely resolution of issues and clear communication with all stakeholders.

Your Responsibilities

  • Act as the primary incident lead during your region's operational hours, coordinating technical investigations, centralizing communication, and engaging relevant engineering and SRE teams for escalations.

  • Address escalations from Tier 1 support, utilizing runbooks, metrics, logs, and system diagnostics to troubleshoot and resolve issues or escalate to Tier 3 as necessary.

  • Create and maintain runbooks, workflows, and operational documentation to ensure consistent responses to recurring issues, collaborating with product teams to enhance coverage over time.

  • Develop, maintain, and enhance automation scripts and tools to streamline common remediation processes, improving response times and minimizing manual operational tasks.

  • Utilize metrics, logs, and tracing tools (such as Grafana/Prometheus, GCP Monitoring, and OpenTelemetry) to proactively identify issues, validate system behavior, and drive continuous improvement in detection methods.

About Serve Robotics

Serve Robotics is at the forefront of revolutionizing urban logistics through our friendly sidewalk robots, designed to enhance delivery efficiency while easing street congestion. Our innovative approach is making a significant impact in major cities across the United States. Join us as we lead the way in robotic delivery solutions that benefit communities and businesses alike.
