Reliability Engineer at Sieve | San Francisco

Sieve, San Francisco
On-site, Full-time

Experience Level

Entry Level

Qualifications

Candidates should possess a robust background in reliability engineering, cloud infrastructure, and incident management. Familiarity with video data and a passion for innovative AI solutions will be advantageous.

About the job

About Sieve

Sieve is an AI research lab dedicated solely to video data. Our approach integrates exabyte-scale video infrastructure with state-of-the-art video understanding techniques and a myriad of data sources, creating datasets that redefine video modeling. Video accounts for 80% of global internet traffic and has become the vital digital medium fueling creativity, communication, gaming, AR/VR, and robotics. At Sieve, we aim to eliminate the most significant bottleneck hindering the expansion of these applications: access to high-quality training data.


With strategic partnerships with leading AI labs, our team of just 12 has achieved remarkable financial success, generating $XXM last quarter alone. Earlier this year, we secured Series A funding from elite firms including Matrix Partners, Swift Ventures, Y Combinator, and AI Grant.


About the Role

As we process petabytes of video across numerous nodes and cloud environments, ensuring reliability, observability, and security is essential to our growth.


We are seeking our inaugural Reliability Engineer, who will focus entirely on fortifying the infrastructure that underpins Sieve. This role demands high ownership and a deep understanding of:

  • System throughput and stability

  • Monitoring and incident management

  • Security principles, including least-privilege design

  • Minimizing operational burdens for the entire engineering team


You will collaborate closely with our CTO and founding engineers to develop the foundational tools that empower our engineering efforts.


This position is ideal for an engineer who is passionate about reliability, throughput, observability, and security. You are proactive in anticipating potential failure modes, reducing operational risks, and designing resilient systems.


If a system failure occurs, you take it personally, thriving under the weight of responsibility.


What You'll Be Doing

  • Collaborate with engineering to design and validate infrastructure supporting PB-scale workloads

  • Develop and manage Terraform-based multi-cloud deployments

  • Enhance cloud and data security (SSO, IAM, least privilege access, auditability)

  • Lead incident response efforts and strengthen systems against failures

  • Create CI/CD systems to minimize user errors and maximize safety

  • Establish monitoring and alerting frameworks (Prometheus, OpenTelemetry, VictoriaMetrics)

About Sieve

Sieve is at the forefront of AI research, dedicated to harnessing the power of video data to create groundbreaking solutions that enhance the digital landscape. Join us as we redefine video modeling and contribute to transformative applications across various industries.
