Software Engineer - Data Infrastructure & Acquisition

Speechify, Cardiff, United Kingdom
Remote, Full-time

Experience Level

Mid to Senior

Qualifications

Ideal Candidate Profile:

  • Bachelor's, Master's, or PhD in Computer Science or a related field.
  • 5+ years of experience in software development.
  • Strong proficiency in bash/Python scripting within Linux environments.
  • Expertise in Docker and Infrastructure-as-Code concepts, with professional experience in at least one major cloud platform (we use GCP).
  • Experience with web crawlers and large-scale data processing workflows is a plus.
  • Ability to manage multiple tasks and adapt to changing priorities effectively.
  • Exceptional communication skills, both written and verbal.

About the job

Speechify aims to remove reading barriers for learners everywhere. More than 50 million people use our text-to-speech tools to turn PDFs, books, Google Docs, news articles, and websites into audio. This helps users read faster, cover more material, and remember more.

Our products include iOS, Android, and Mac apps, a Chrome extension, and a web app. Google named Speechify the Chrome Extension of the Year, and Apple awarded us the 2025 Design Award for Inclusivity.

The Speechify team is fully remote, with nearly 200 people across the globe. Our group includes frontend and backend engineers, AI research scientists, and team members who have come from companies like Amazon, Microsoft, and Google, as well as from top PhD programs such as Stanford's.

Role Overview

The Data team within Speechify’s AI division is looking for a Software Engineer focused on data infrastructure and acquisition. This position plays a key part in collecting and managing the large-scale datasets that power our model training. The team’s goal is to build high-quality datasets at petabyte scale and keep costs low by tightly integrating infrastructure, engineering, and research.

What You Will Do

  • Find and bring in new audio data sources for the ingestion pipeline.
  • Maintain and grow the cloud infrastructure for data ingestion, currently on Google Cloud Platform (GCP) and managed with Terraform.
  • Work with scientists to improve data cost, throughput, and quality, supporting the creation of advanced models.
  • Partner with the AI team and company leadership to shape the dataset roadmap for future consumer and enterprise products.

Location

This role is based in Cardiff, United Kingdom.

About Speechify

At Speechify, we are dedicated to transforming the reading experience for individuals everywhere. Our innovative text-to-speech technology not only enhances accessibility but also helps users absorb information faster and more effectively. Join a dynamic and diverse team that prioritizes inclusivity and creativity.
