About the job
The mission of Speechify is to eliminate reading barriers to learning.
With over 50 million users, Speechify's text-to-speech products turn many kinds of text (PDFs, books, Google Docs, news articles, and websites) into audio, helping users read faster and retain more information. Our products include an iOS app, Android app, Mac app, Chrome extension, and web app. Recently, Google named Speechify Chrome Extension of the Year, and Apple recognized us with the 2025 Design Award for Inclusivity.
Our fully remote team of nearly 200 professionals worldwide includes frontend and backend engineers, AI researchers, and experts from tech giants like Amazon, Microsoft, and Google, as well as alumni of institutions like Stanford and founders of high-growth startups.
Overview
We are seeking a talented Software Engineer to join our AI team, focusing on data infrastructure. This position will be pivotal in improving our model training operations through efficient data collection. Our ability to build high-quality datasets cost-efficiently at petabyte scale relies on the seamless integration of infrastructure, engineering, and research.
What You’ll Do
- Identify and source new audio data to enrich our ingestion pipeline.
- Manage and expand our cloud infrastructure on GCP using Terraform.
- Work closely with our scientists to optimize cost, throughput, and data quality for next-generation models.
- Collaborate with the AI team and Speechify leadership to develop a comprehensive dataset roadmap to enhance our consumer and enterprise offerings.
An Ideal Candidate Should Have
- A BS, MS, or PhD in Computer Science or a related field.
- A minimum of 5 years of professional software development experience.
- Expertise in Bash and Python scripting in Linux environments.
- Proficiency in Docker and Infrastructure-as-Code, with experience in at least one major cloud provider (we use GCP).
- Familiarity with web crawlers and large-scale data processing workflows is a plus.
- Strong multitasking abilities and adaptability to shifting priorities.
- Excellent written and verbal communication skills.

