Software Engineer - Data Infrastructure & Acquisition

Speechify, Ann Arbor, MI, USA
Remote, Full-time, $140K/yr - $200K/yr

Experience Level

Mid to Senior

About the job

Speechify aims to remove reading as a barrier to learning. More than 50 million people use Speechify to turn text from PDFs, books, Google Docs, and web articles into audio, helping them read faster and retain more. The product suite spans iOS, Android, Mac, a Chrome extension, and web. Speechify has earned recognition from Google as Chrome Extension of the Year and from Apple with the 2025 Design Award for Inclusivity.

The company operates fully remotely, with a team of nearly 200 professionals. Team members include frontend and backend engineers, AI research scientists, and specialists from organizations such as Amazon, Microsoft, Google, Stripe, and Vercel.

Role Overview

Speechify is hiring a Software Engineer focused on data infrastructure and acquisition for the AI team. This engineer will oversee data collection processes used to train models, working closely with both engineers and researchers. The work centers on building and maintaining high-quality datasets at petabyte scale, while keeping infrastructure costs low through thoughtful engineering.

Key Responsibilities

  • Find and source new audio data to improve the ingestion pipeline.
  • Manage and expand cloud infrastructure for data ingestion, currently on Google Cloud Platform (GCP) with Terraform.
  • Work with scientists to optimize data cost, throughput, and quality for next-generation model development.
  • Partner with the AI team and leadership to plan dataset roadmaps for consumer and enterprise products.

Qualifications

  • Bachelor’s, Master’s, or PhD in Computer Science or a related field.
  • At least 5 years of professional experience in software development.
  • Skilled in bash and Python scripting, especially in Linux environments.
  • Hands-on experience with Docker, Infrastructure-as-Code, and a major cloud provider (GCP preferred).
  • Knowledge of web crawlers and large-scale data processing workflows is a plus.
  • Comfortable managing multiple priorities and adapting to shifting requirements.
  • Strong written and verbal communication skills.

Location

Remote. The team is distributed, but this role is listed for Ann Arbor, MI, USA.

About Speechify

Speechify is dedicated to breaking down barriers to reading and learning through innovative text-to-speech technology. With a user base of over 50 million, our products empower individuals to engage with text in a more meaningful way, making it accessible and enjoyable.
