About the job
Speechify’s mission is to remove barriers to reading and make knowledge more accessible. More than 50 million people use Speechify’s text-to-speech tools to convert PDFs, books, Google Docs, news articles, and websites into audio. Our platform helps users read faster, understand more, and remember what matters. Speechify’s product lineup includes iOS, Android, and Mac apps, a Chrome extension, and a web app. Recent honors include Chrome Extension of the Year from Google and Apple’s 2025 Design Award for Inclusivity.
Our fully remote team brings together nearly 200 people worldwide, including engineers, AI researchers, and specialists from companies like Amazon, Microsoft, and Google, as well as alumni of Stanford and high-growth startups such as Stripe, Vercel, and Bolt. We operate without a physical office and collaborate across time zones and backgrounds.
Role Overview
Speechify is hiring a Software Engineer for our data-focused AI team in Madrid, Spain. This role centers on building and maintaining the systems that collect and process the vast audio datasets used to train our models. The work blends infrastructure, engineering, and research, aiming to create high-quality datasets at petabyte scale while controlling costs.
What You Will Do
- Find and connect new audio data sources to our ingestion pipeline.
- Manage and improve our cloud infrastructure for data ingestion, using Google Cloud Platform and Terraform.
- Work alongside scientists to optimize cost, throughput, and data quality, supporting our advanced AI models.
- Collaborate with the AI team and company leadership to shape the dataset roadmap for future consumer and enterprise products.
What We Look For
- BS, MS, or PhD in Computer Science or a related field.
- At least 5 years of professional software development experience.
- Strong skills in Bash and Python scripting within Linux environments.
- Professional experience with Docker and Infrastructure-as-Code (such as Terraform), plus hands-on work with at least one major cloud provider (we use GCP).
- Experience with web crawlers and large-scale data processing is a plus.
- Ability to manage multiple tasks and adapt to shifting priorities.
- Clear written and verbal communication skills.

