About the job
Join our innovative team at Turnitin, where Machine Learning is pivotal to our ongoing success. We have an ambitious product roadmap, and you will be a part of a global team of inquisitive and dedicated scientists and engineers, all committed to delivering state-of-the-art, well-structured Machine Learning systems. Collaborate closely with product and engineering teams across Turnitin to seamlessly integrate Machine Learning into a diverse array of learning, teaching, and integrity products.
Our Machine Learning is used by countless instructors teaching millions of students globally, so your contributions will have far-reaching impact. Our platform has processed billions of submissions, and your work will enhance our AI Writing detection system, automate feedback on student writing, explore authorship, and transform assessment creation and grading processes, among other vital back-end functions.
Key Responsibilities:
You will be part of an applied science group focused on modern Deep Learning. As a Senior Machine Learning Scientist, you will bring a balanced skill set encompassing both the scientific and software engineering aspects of (Deep) Machine Learning. Your primary focus will be on developing novel and deployable ML models and solutions in scenarios where existing solutions may not suffice. A strong understanding of the mathematical foundations of machine learning and deep neural networks is crucial, enabling you to devise novel model architectures, loss functions, and training methods. Staying current with the latest AI and Deep Learning research across various modalities is essential, as you will apply these insights to your work.
Although we use established training platforms, you will also create custom training loops. Models must be deployable in our products, which requires proficiency in production-level coding and software engineering. You may train large models (with hundreds of billions of parameters), requiring expertise in multi-GPU and multi-node training and familiarity with the latest advances in training and inference. Models must excel in production, balancing accuracy and computational cost effectively. Your role will routinely involve dataset exploration, generation (including synthetic datasets), design, construction, and analysis, which may occupy a significant portion of your time. Given the scale of data (billions of samples), the ability to write parallel and efficient pipelines is essential. Additionally, you will engage in code and model maintenance, hardening models for production pipelines, developing demos, and presenting your findings.

