About the job
Speechify’s mission is to remove reading barriers and expand access to learning for everyone. More than 50 million people use Speechify’s text-to-speech tools to convert PDFs, books, Google Docs, news articles, and websites into audio, helping users read faster and remember more. Our product suite includes iOS, Android, Mac, Chrome extension, and web apps, and has earned honors such as Google’s Chrome Extension of the Year and Apple’s 2025 Design Award for Inclusivity.
Speechify operates as a fully distributed team with nearly 200 professionals. Team members bring experience from Amazon, Microsoft, Google, Stanford, Stripe, Vercel, and other leading organizations. Our group includes frontend and backend engineers, AI research scientists, and talent from top academic programs and startups.
Role Overview
Speechify is looking for a Software Engineer focused on data infrastructure and acquisition to join the AI team. This engineer will oversee the collection of data used for model training, helping to build and maintain high-quality datasets at petabyte scale. The role bridges infrastructure, engineering, and research to ensure efficient data operations.
Key Responsibilities
- Identify new sources of audio data and connect them to the ingestion pipeline.
- Manage and improve the cloud infrastructure supporting data ingestion, currently running on Google Cloud Platform (GCP) and managed with Terraform.
- Work with scientists to optimize cost, throughput, and data quality, enabling larger datasets at lower cost for next-generation models.
- Collaborate with the AI team and company leadership to shape the dataset roadmap for upcoming consumer and enterprise products.
Qualifications
- BS, MS, or PhD in Computer Science or a related field.
- Minimum 5 years of professional software development experience.
- Strong skills in bash and Python scripting in Linux environments.
- Deep understanding of Docker and Infrastructure-as-Code, with hands-on experience on at least one major cloud platform (GCP preferred).
- Experience with web crawlers and large-scale data processing is a plus.
- Comfort managing multiple priorities and responding to changing needs.
- Excellent written and spoken communication skills.
Location
This position is open to candidates in Charlotte, NC, USA.

