Research Engineer – Audio & Speech Models

Zyphra, San Francisco
On-site Full-time

Experience Level

Entry Level

Qualifications

Candidates should possess a blend of technical skills and research aptitude, with an emphasis on audio and speech model development.

About the job

Zyphra is an artificial intelligence company based in San Francisco, California.

The Opportunity:

Join Zyphra’s Audio Team as a Research Engineer – Audio & Speech Models. You will help develop open-source text-to-speech and audio models. Your contributions will span the full model training process, from data collection and processing to the design of new architectures and training approaches.

Your Responsibilities:

  • Conduct large-scale audio training operations

  • Optimize the performance of our training infrastructure

  • Collect, process, and evaluate audio datasets

  • Implement architectural and methodological improvements through rigorous testing

What We Seek:

  • A strong research mindset with the ability to navigate projects from ideation to implementation and documentation.

  • Proficiency in rapid prototyping and implementation, allowing for swift experimentation.

  • Effective collaboration skills in a fast-paced research environment.

  • A quick learner who is eager to embrace and implement new concepts.

  • Excellent communication abilities, enabling you to contribute to both research and engineering tasks at scale.

Preferred Qualifications:

  • Expertise in training audio models, such as text-to-speech, ASR, speech-to-speech, or emotion recognition.

  • Experience with training audio autoencoders.

  • Solid understanding of signal processing, particularly in audio.

  • Familiarity with diffusion models, consistency models, or GANs.

  • Experience with large-scale (multi-node) GPU training environments.

  • Strong understanding of experimental methodologies for conducting rigorous tests and ablations.

  • Interest in large-scale, parallel data processing pipelines.

  • Competence in PyTorch and Python programming.

  • Experience contributing to large, established codebases and getting up to speed in them quickly.

About Zyphra

Zyphra is at the forefront of artificial intelligence innovation, dedicated to creating advanced solutions that transform the way we interact with technology. Our team thrives on collaboration, creativity, and a passion for pushing the boundaries of what's possible.
