Software Engineer - Data Infrastructure & Acquisition

Speechify
Mexico City, Mexico
Remote, Full-time

Qualifications

An ideal candidate should have:

  • BS, MS, or PhD in Computer Science or a related field.
  • A minimum of 5 years of professional software development experience.
  • Proficiency in bash/Python scripting within Linux environments.
  • Expertise in Docker and Infrastructure-as-Code practices, with professional experience in at least one major cloud platform (GCP preferred).
  • Experience with web crawlers and large-scale data processing workflows is advantageous.
  • Ability to manage multiple tasks and adapt to shifting priorities effectively.
  • Strong written and verbal communication skills.

About the job

Speechify’s mission is to remove barriers to learning caused by reading challenges. With a user base of more than 50 million, Speechify turns PDFs, books, Google Docs, news articles, and websites into audio, helping people read faster and retain more. Our text-to-speech technology powers apps for iOS, Android, Mac, Chrome, and the web. Google named our Chrome extension Extension of the Year, and Apple recognized us with a 2025 Design Award.

Speechify operates fully remotely, with no physical office. Nearly 200 team members, including frontend and backend engineers, AI researchers, and industry veterans from Amazon, Microsoft, Google, and more, work together from around the world. Our team also includes graduates of top PhD programs and founders of companies like Stripe, Vercel, and Bolt.

Role Overview

The Data team within our AI division is looking for a Software Engineer focused on data infrastructure and acquisition. This position plays a central role in collecting and managing the data that powers model training. The team’s goal: build and maintain high-quality datasets at petabyte scale while controlling costs through strong infrastructure and engineering practices.

What You’ll Do

  • Find and integrate new sources of audio data into the ingestion pipeline.
  • Manage and improve the cloud infrastructure for data ingestion, using Google Cloud Platform (GCP) and Terraform.
  • Work with scientists to optimize for cost, throughput, and data quality, delivering better datasets to support advanced models.
  • Collaborate with the AI team and company leadership to shape the dataset roadmap for future consumer and enterprise products.

Location

This role is based in Mexico City, Mexico.

About Speechify

Speechify is dedicated to making reading accessible for everyone, offering innovative text-to-speech solutions that empower users to learn more effectively. With a diverse and talented global team, we prioritize inclusivity and excellence in all that we do.
