Software Engineer, Data Infrastructure & Acquisition

Speechify, Charlotte, NC, USA
Remote Full-time $140K/yr - $200K/yr

Experience Level

Mid to Senior

Qualifications

An ideal candidate should have a BS, MS, or PhD in Computer Science or a related discipline and at least 5 years of professional experience in software development. The role calls for strong proficiency in bash and Python scripting within Linux environments, plus extensive knowledge of Docker and Infrastructure-as-Code principles, with professional experience in at least one major cloud provider (we use GCP). Experience with web crawlers and large-scale data processing workflows is advantageous. Candidates should also be able to manage multiple priorities, adapt to evolving needs, and communicate exceptionally well in writing and in speech.

About the job

Speechify’s mission is to remove reading barriers and expand access to learning for everyone. More than 50 million people use Speechify’s text-to-speech tools to convert PDFs, books, Google Docs, news articles, and websites into audio, helping users read faster and remember more. Our product suite includes iOS, Android, Mac, Chrome extension, and web apps, and has earned honors such as Google’s Chrome Extension of the Year and Apple’s 2025 Design Award for Inclusivity.

Speechify operates as a fully distributed team with nearly 200 professionals. Team members bring experience from Amazon, Microsoft, Google, Stanford, Stripe, Vercel, and other leading organizations. Our group includes frontend and backend engineers, AI research scientists, and talent from top academic programs and startups.

Role Overview

Speechify is looking for a Software Engineer focused on data infrastructure and acquisition to join the AI team. This engineer will oversee the collection of data used for model training, helping to build and maintain high-quality datasets at petabyte scale. The role bridges infrastructure, engineering, and research to ensure efficient data operations.

Key Responsibilities

  • Identify new sources of audio data and connect them to the ingestion pipeline.
  • Manage and improve the cloud infrastructure supporting data ingestion, currently running on Google Cloud Platform (GCP) and managed with Terraform.
  • Work with scientists to optimize cost, throughput, and data quality, enabling larger datasets at lower costs for next-generation models.
  • Collaborate with the AI team and company leadership to shape the dataset roadmap for upcoming consumer and enterprise products.

Qualifications

  • BS, MS, or PhD in Computer Science or a related field.
  • Minimum 5 years of professional software development experience.
  • Strong skills in bash and Python scripting in Linux environments.
  • Deep understanding of Docker and Infrastructure-as-Code, with hands-on experience in at least one major cloud platform (GCP preferred).
  • Background with web crawlers and large-scale data processing is a plus.
  • Comfort managing multiple priorities and responding to changing needs.
  • Excellent written and spoken communication skills.

Location

This position is open to candidates in Charlotte, NC, USA.

About Speechify

Speechify is dedicated to making reading an accessible and enriching experience for everyone. Our innovative text-to-speech products have helped millions turn their reading into audio, fostering better comprehension and retention.
