Network Engineer - Reliability & Observability at Fluidstack | New York, NY

Fluidstack | New York, NY
On-site Full-time $150K/yr - $250K/yr

Qualifications

  • Proven experience as a Network Engineer or in a similar role with a focus on reliability and observability.

  • Strong background in Quality Assurance, data collection, and network reliability metrics.

  • Familiarity with both hardware (electronics, optics) and software development practices.

  • Experience with data analysis and the utilization of metrics to inform operational strategies.

  • Excellent problem-solving and analytical skills, coupled with the ability to work collaboratively across teams.

About the job

Join Fluidstack as a Network Engineer!

At Fluidstack, we are at the forefront of building cutting-edge infrastructure designed for abundant intelligence. Collaborating with leading AI labs, government entities, and major enterprises such as Mistral, Poolside, and Meta, we strive to deliver compute capabilities at unprecedented speeds.

Our mission is to accelerate the realization of Artificial General Intelligence (AGI). We are urgently seeking passionate individuals who are committed to delivering exceptional infrastructure. At Fluidstack, we take pride in our work, treating our customer outcomes as our own. If you are driven by purpose and excellence, and ready to put in the effort necessary to shape the future of intelligence, we invite you to join us!

Position Overview

Fluidstack is on the lookout for a Network Engineer specializing in Reliability & Observability. In this pivotal role, you will act as a reliability engineer, leading the charge in developing processes, collecting data, and establishing reliability metrics aimed at enhancing the quality and dependability of AI networks throughout all operational phases.

Your primary focus will be on creating systems, tools, and data pipelines to boost network quality, while also automating metrics reporting (24/7) and generating periodic reliability assessments for both internal teams and customers.

This position is perfect for seasoned network operators who possess a deep passion for reliability and have experience in designing and implementing full lifecycle software, including conducting Quality Assurance audits and analyzing failure rates. A strong interest in hardware (both electronics and optics) and software development is essential, alongside a commitment to leveraging data for informed decision-making in deployment and operations.

We encourage experienced Site Reliability Engineers (SREs) with a strong networking background to apply!

Key Responsibilities

  • Quality Assurance Ownership: Design and implement QA processes tailored to both network hardware and the networks themselves.

  • Data Pipelines: Develop and deploy both serverless and manually triggered workflows that give our clients observability into network quality and reliability.

  • Deployment and Operations Assistance: Collaborate with various teams to support full lifecycle data collection, analysis, and process enhancements aimed at meeting service level agreements (SLAs) and service level objectives (SLOs).

  • Process Engineering: Innovate and implement process improvements to streamline deployment and operational workflows.

About Fluidstack

Fluidstack is revolutionizing the future of intelligent infrastructure, partnering with top-tier AI labs and enterprises. Our commitment to excellence and customer-centered outcomes drives our mission to unlock the full potential of compute capabilities.
