
Software Engineer, Data Infrastructure & Acquisition

Speechify, Columbus, OH, USA
Remote, Full-time, $140K/yr - $200K/yr

Experience Level

Mid to Senior

Qualifications

BS/MS/PhD in Computer Science or a related field. 5+ years of industry experience in software development. Proficiency with bash/Python scripting in Linux environments. Proficiency in Docker and Infrastructure-as-Code concepts, and professional experience with at least one major cloud provider (we use GCP). Experience with web crawlers and large-scale data processing workflows is a plus. Ability to handle multiple tasks and adapt to changing priorities. Strong communication skills, both written and verbal.

About the job

Speechify’s mission is to remove reading as a barrier to learning. Over 50 million people use our text-to-speech tools to turn books, PDFs, Google Docs, news, and websites into audio. Our products include iOS and Android apps, a Mac app, a Chrome extension, and a web app. Speechify has earned recognition from Google as Chrome Extension of the Year and received Apple’s 2025 Design Award for Inclusivity.

Our fully distributed team includes nearly 200 professionals, from engineers to AI researchers, with backgrounds at Amazon, Microsoft, Google, Stripe, Vercel, and Bolt. Collaboration happens across time zones and continents.

Role Overview

This Software Engineer, Data Infrastructure & Acquisition role sits within the AI team. The focus: oversee data collection and infrastructure to support model training. The team builds large, high-quality datasets at petabyte scale while keeping costs low. The work blends infrastructure, engineering, and research. Candidates who care deeply about data engineering and AI will find meaningful challenges here.

What You Will Do

  • Find and evaluate new audio data sources, then integrate them into the ingestion pipeline.
  • Maintain and scale cloud infrastructure for data ingestion, using GCP and Terraform.
  • Work with scientists to balance cost, throughput, and data quality, delivering better datasets for model development.
  • Partner with the AI team and leadership to shape a dataset roadmap for future Speechify products.

What We Look For

  • Bachelor’s, Master’s, or PhD in Computer Science or a related field.
  • At least 5 years of software development experience.
  • Strong skills in bash and Python scripting in Linux environments.
  • Hands-on experience with Docker and Infrastructure-as-Code (such as Terraform), plus experience with a major cloud provider (GCP preferred).
  • Familiarity with web crawlers and large-scale data processing is a plus.
  • Comfort managing multiple priorities and shifting demands.
  • Clear communication skills, both written and spoken.

Location: Columbus, OH, USA (remote team, distributed worldwide).

About Speechify

Speechify is dedicated to ensuring that reading is never a barrier to learning. With over 50 million users and multiple award-winning products, we are a leader in the text-to-speech industry. Our fully remote team fosters collaboration and innovation, helping us to continually enhance our offerings and reach new heights in making information accessible to everyone.
