Software Engineer, Data Infrastructure & Acquisition

Speechify
Hong Kong, Hong Kong
Remote, Full-time

Qualifications

Ideal Candidate Profile

  • BS, MS, or PhD in Computer Science or a related discipline.
  • Minimum of 5 years of industry experience in software development.
  • Proficiency in bash/Python scripting within Linux environments.
  • Experience with Docker and Infrastructure-as-Code principles, with professional familiarity with at least one major cloud provider (we use GCP).
  • Experience in large-scale data processing workflows and web crawling is advantageous.
  • Strong multitasking abilities and adaptability to shifting priorities.
  • Exceptional verbal and written communication skills.

About the job

Speechify builds text-to-speech tools that help over 50 million people worldwide turn reading materials, like PDFs, books, Google Docs, news articles, and websites, into audio. Our platform spans iOS, Android, Mac, Chrome, and web. Recent recognition includes Chrome Extension of the Year by Google and Apple’s 2025 Design Award for Inclusivity.

The team brings together nearly 200 people from diverse backgrounds, including engineers, AI researchers, and alumni of companies such as Amazon, Microsoft, and Google. Many hold advanced degrees from places like Stanford. Speechify works fully remotely, without a physical office.

Role Overview

This Software Engineer role focuses on data infrastructure and acquisition within our AI group. The main mission: collect and manage large-scale, high-quality datasets to improve model training. The work blends infrastructure, engineering, and research to support petabyte-scale data growth while keeping costs in check.

What You Will Do

  • Identify and integrate new audio data sources into the ingestion pipeline.
  • Oversee and improve cloud infrastructure for data ingestion, using GCP and Terraform.
  • Partner with scientists to refine cost, throughput, and data quality, enabling richer and larger datasets for future models.
  • Work with the AI team and leadership to shape a dataset roadmap for upcoming consumer and enterprise products.

Location

This position is based in Hong Kong, Hong Kong. The team operates fully remotely.

About Speechify

Speechify is dedicated to breaking down the barriers to reading and learning. With a global user base exceeding 50 million, our advanced text-to-speech technology empowers individuals to consume written content more efficiently. Our award-winning products are designed to enhance accessibility and inclusivity, making learning resources available to everyone, regardless of their reading capabilities.
