companyCoreWeave logo

Operations Engineer - Fleet Reliability

CoreWeavePoland
On-site Full-time

Clicking Apply Now takes you to AutoApply where you can tailor your resume and apply.


Unlock Your Potential

Generate Job-Optimized Resume

One Click And Our AI Optimizes Your Resume to Match The Job Description.

Is Your Resume Optimized For This Role?

Find Out If You're Highlighting The Right Skills And Fix What's Missing

Experience Level

Experience

Qualifications

Key Responsibilities:Configure and maintain large-scale, high-performance supercomputing clusters powered by cutting-edge GPUs. Troubleshoot hardware and software issues; collaborate with data center, network, hardware, and platform teams to ensure resolution. Monitor and analyze system performance, implementing necessary remediation actions to maintain cloud health.

About the job

CoreWeave is revolutionizing the cloud landscape with our platform, designed specifically for AI innovators. Our infrastructure is trusted by leading AI labs, startups, and enterprises, empowering them to scale their AI initiatives with confidence. Established in 2017 and publicly traded (Nasdaq: CRWV) since March 2025, CoreWeave blends high-performance computing with expert technical support to drive innovation. Explore our offerings at www.coreweave.com.

We take pride in being a Living Wage accredited employer.

We are looking for motivated individuals willing to work two shifts from 7 AM to 9 PM.

Selected candidates will undergo onboarding training at our US headquarters for up to 2 weeks within their first month of employment.

The Fleet Reliability Operations team is integral to the daily provisioning, management, and uptime of CoreWeave's expansive fleet of server nodes. This team plays a crucial role in our growth strategy, actively involved in the configuration, updates, and remote troubleshooting of our high-performance supercomputing clusters, including their networking and tool dependencies. Your mission will be to combat the forces of entropy, maximizing the operational capacity of CoreWeave's offerings to our customers.

We are seeking curious, creative, and tenacious problem solvers to enhance our Fleet Reliability Operations team. As part of our dedicated engineering team, you will work diligently to deploy nodes swiftly and efficiently. You will be instrumental in driving batches of server nodes through our provisioning and validation processes while effectively troubleshooting issues that arise.

About CoreWeave

CoreWeave is the essential cloud for AI, providing a robust platform and expert support to help innovators build and scale their AI projects. Our commitment to quality and performance makes us a trusted partner for businesses around the globe.

Similar jobs

Tailoring 0 resumes

We'll move completed jobs to Ready to Apply automatically.