Qualifications
Key Responsibilities
Identify and source new audio data for integration into our ingestion pipeline.
Manage and expand our cloud infrastructure for the ingestion pipeline, currently hosted on GCP and managed through Terraform.
Collaborate with scientists to optimize cost, throughput, and quality, enabling the delivery of richer data at larger scales and lower costs to enhance our next-generation models.
Work alongside the AI Team and Speechify leadership to develop a dataset roadmap that fuels our future consumer and enterprise products.
Ideal Candidate Profile
Bachelor's, Master's, or PhD in Computer Science or a related field.
Minimum of 5 years of hands-on experience in software development.
Strong proficiency in bash and Python scripting within Linux environments.
Familiarity with Docker and Infrastructure-as-Code principles, with professional experience in at least one major Cloud Provider (GCP preferred).
Experience with web crawlers and large-scale data processing workflows is advantageous.
Ability to manage multiple tasks and adapt to shifting priorities.
Excellent communication skills, both written and verbal.
About the job
Speechify aims to remove barriers to learning by transforming text into audio. Over 50 million people use Speechify’s text-to-speech tools to listen to PDFs, books, Google Docs, news, and websites. The product suite covers iOS, Android, Mac, Chrome, and web platforms. Google recognized Speechify as Chrome Extension of the Year, and Apple awarded it the 2025 Design Award for Inclusivity.
The company operates fully remotely with a team of nearly 200. The team includes frontend and backend engineers, AI researchers, professionals from Amazon, Microsoft, Google, and Stanford, and founders of successful startups.
Role overview
Speechify is hiring a Software Engineer for the Data Infrastructure & Acquisition team in the AI department. This role centers on managing and improving data collection processes that support model training. The team builds large-scale, high-quality datasets for AI research and development, focusing on both scale and cost efficiency.
Location
Rochester, NY, USA (remote team)
About Speechify
Speechify is dedicated to making reading accessible for everyone. With a user base of over 50 million, our cutting-edge text-to-speech solutions empower individuals to engage with information in an entirely new way. Our fully remote team prides itself on innovation and inclusivity, attracting top talent from leading tech companies and academic institutions.