Senior Staff Site Reliability Engineer - Data Center

PathAI, Boston, MA or Remote
Hybrid Full-time $165.8K/yr - $224.4K/yr

Experience Level

Mid to Senior

Qualifications

What You Bring

Our ideal candidate possesses a unique blend of skills, including but not limited to:

  • 8+ years of relevant experience in Site Reliability Engineering or related fields.
  • A strong proficiency in automation, utilizing scripting, configuration management tools (such as Ansible), and programming languages like Python and GoLang to eliminate manual tasks.
  • Experience in creating monitoring infrastructures with contemporary observability tools such as Datadog, Grafana, or Prometheus.
  • Familiarity with infrastructure as code tools like Terraform and CloudFormation.
  • Hands-on experience managing physical hardware stacks in production environments (e.g., iDRAC, IPMI, Nvidia UFM, Juniper Systems).
  • Expertise in optimizing storage solutions for high-performance workloads, including Quobyte, S3, FSx, and EFS.
  • A solid understanding of modern network designs and a comfort level operating across various network layers.

About the job

Who We Are

At PathAI, we are dedicated to revolutionizing patient outcomes through the power of AI-driven pathology. Our commitment to advancing traditional pathology methodologies into innovative technologies is at the forefront of our mission. By leveraging these advancements, we aim to expedite drug development, enhance diagnostic accuracy, and deliver life-saving treatments to patients with urgency. Join our diverse and talented team, united in solving intricate challenges and making a substantial impact in healthcare.

Where You Fit

We are seeking a highly skilled Senior Staff Site Reliability Engineer who will play a pivotal role in designing, constructing, and managing our hybrid cloud and on-premises environment.

What You’ll Do

In this role, you will harness your extensive skills and develop new ones as you:

  • Elevate our operational practices by implementing Site Reliability Engineering (SRE) best practices focused on user satisfaction, monitoring, and automation.
  • Engineer robust infrastructure patterns for our cloud environments using Amazon Web Services, emphasizing security, reliability, and scalability.
  • Design, construct, and manage our data center to support our rapidly expanding Machine Learning team.
  • Integrate on-premises datacenter environments with our existing cloud infrastructure to create a seamless hybrid cloud solution.
  • Enhance the reliability and resilience of our infrastructure through thorough root-cause analysis and identifying design gaps.
  • Engage in platform on-call rotations and provide assistance during critical incident responses.

About PathAI

PathAI is a pioneering company that is transforming the field of pathology through innovative AI solutions, with a strong commitment to improving patient outcomes and accelerating healthcare advancements.
