Machine Learning Data Engineer

Recraft, London, UK
On-site Full-time

Qualifications

Essential Requirements

  • Proficient in Python programming, with a knack for writing clean, maintainable, and production-ready code.
  • Experience with data pipelines and data engineering practices.
  • Familiarity with cloud platforms and tools, particularly Kubernetes and S3 storage systems.
  • Strong analytical skills and a problem-solving mindset.
  • Ability to work collaboratively within a team and communicate effectively.

About the job

About Recraft

Founded in the US in 2022 and now thriving in London, UK, Recraft is revolutionizing the creative landscape with an innovative AI tool designed for professional designers, illustrators, and marketers. We are committed to setting a new benchmark in the realm of image generation.

Our platform empowers creators to swiftly generate and refine original images, vector art, illustrations, icons, and 3D graphics using AI technology. With over 3 million users across 200 countries producing hundreds of millions of stunning images, Recraft is just getting started in its journey to redefine creativity.

Join us in a world of professional growth, contribute to large-scale projects, and help shape the future of creativity. Our mission is clear: to make Recraft an indispensable tool for every designer and to elevate the industry standard. We are dedicated to ensuring that creators retain full control over their creative processes, providing them with cutting-edge tools to transform their ideas into reality.

If you are driven by the passion to push the boundaries of AI, we invite you to join our team!

Position Overview

As part of Recraft, you will play a pivotal role in developing the next generation of generative models for images and text. We are seeking an ML Data Engineer to improve our data pipelines for unstructured data, primarily images. Your work will ensure our training workflows are efficient, reliable, and scalable. You will design and manage high-throughput ingestion and preprocessing on Kubernetes, evolve our internal data-pipeline framework, and collaborate closely with ML engineers to deliver datasets that significantly improve model quality.

Key Responsibilities

  • Design and maintain robust data-ingestion pipelines to source and prepare large-scale datasets of images (and occasionally text/HTML) from open, publicly accessible, and authorized sources.
  • Manage the complete data flow: from raw data to quality filtering, deduplication, validation, and creation of training-ready artifacts.
  • Enhance our Kubernetes-based data-pipeline framework, including distributed job handling, retries, monitoring, and automation.
  • Utilize S3-style object storage for efficient data layouts, lifecycle management, throughput optimization, and cost considerations.
  • Implement additional tools for pipeline observability, including progress tracking, health visualization, performance metrics, and alert systems to facilitate rapid iteration.
  • Collaborate closely with ML engineers to align datasets with training requirements and speed up experimentation.

About Recraft

Recraft is an innovative AI-driven platform that empowers professional creators by providing them with tools to generate and refine high-quality visual content quickly and efficiently. With a strong commitment to excellence and user experience, Recraft is reshaping how designers and marketers approach creativity.
