
Software Engineer - Data Infrastructure & Acquisition

Speechify
San Diego, CA, USA
Remote, Full-time, $140K/yr - $200K/yr

Experience Level

Mid to Senior

Qualifications

Ideal Candidate Requirements

  • BS, MS, or PhD in Computer Science or a related field.
  • A minimum of 5 years of software development experience.
  • Proficiency in bash and Python scripting within Linux environments.
  • Experience with Docker, Infrastructure-as-Code practices, and a major Cloud Provider (GCP is preferred).
  • Familiarity with web crawlers and large-scale data processing workflows is a plus.
  • Adept at multitasking and adjusting to shifting priorities.
  • Excellent written and verbal communication skills.

About the job

Speechify builds text-to-speech tools that help over 50 million people access content in new ways. Our products convert reading materials such as PDFs, books, Google Docs, news articles, and websites into audio, helping users read efficiently and retain more. We offer mobile apps for iOS and Android, a Chrome Extension, and a Web App. Our work has earned recognition from Google (Chrome Extension of the Year) and Apple (2025 Design Award for Inclusivity).

Our fully remote team includes nearly 200 professionals worldwide, with backgrounds at Amazon, Microsoft, Google, and top universities such as Stanford. We bring together frontend and backend engineers, AI researchers, and others passionate about accessible technology.

Role Overview

The Software Engineer - Data Infrastructure & Acquisition will join Speechify’s AI team, focusing on the Data division. This role centers on building and maintaining large-scale data collection systems that support model training. The work blends infrastructure, engineering, and research to deliver high-quality datasets at petabyte scale while controlling costs.

What You Will Do

  • Identify and source new audio data, integrating it into Speechify’s ingestion pipeline.
  • Manage and improve cloud infrastructure for the data ingestion pipeline, currently on Google Cloud Platform and managed with Terraform.
  • Partner with scientists to optimize for cost, throughput, and data quality, supporting next-generation model development.
  • Work with the AI team and company leadership to shape the dataset roadmap for upcoming consumer and enterprise products.

Location

This position is based in San Diego, CA, USA. The team operates fully remotely.

About Speechify

Speechify is dedicated to transforming the way people consume written content. Our award-winning text-to-speech products empower users to learn and absorb information without the constraints of traditional reading methods. By fostering a fully remote work environment, we attract top talent from across the globe, driving innovation and inclusivity in education.
