About the job
Please submit your CV in English, including your English proficiency level.
Mindrift is dedicated to connecting talented specialists with project-based AI opportunities from leading technology firms, focusing on the testing, evaluation, and enhancement of AI systems. Please note that participation is project-based and does not constitute permanent employment.
About the Role
As a Freelance AI Evaluation Engineer, you will design and develop challenging coding test cases that rigorously assess AI coding systems:
- Review and enhance realistic coding tasks derived from provided production codebases, ensuring they have realistic scope and requirements.
- Develop comprehensive functional tests that validate end-to-end behavior and edge cases, going beyond superficial checks.
- Create “fair yet challenging” tasks where the AI has all the necessary context but must work to retrieve it (for example, information spread across files and external resources that requires complex reasoning to combine).
- Analyze AI failures to distinguish the specific challenges the model faces from the areas where it is already proficient.
- Refine your work based on feedback from expert QA reviewers who evaluate your submissions against seven quality criteria.
Qualifications
This role is ideal for seasoned developers, software engineers, or test automation specialists interested in part-time, non-permanent projects. Preferred candidates will possess:
- A degree in Computer Science, Software Engineering, or a related field.
- 5+ years of experience in software development, with a strong focus on Python (pytest, async/await, subprocess, file operations).
- A solid background in full-stack development, including building React-based interfaces and robust back-end systems.
- Experience writing functional and integration tests, not just executing them.
- Proficiency with Docker containers (running evaluations locally in containers).
- Understanding of CI/CD processes (experience with GitHub Actions as a user: triggers, labels, reading results).
- English proficiency at a B2 level or higher.
How It Works
Apply → Pass qualifications → Join a project → Complete tasks → Get paid.
Effort Estimation
Tasks for this project are estimated to require approximately 20 hours, depending on complexity. This is only an estimate; you have the flexibility to choose when and how to work. To be accepted, tasks must be submitted by the deadline and meet the specified acceptance criteria.
Compensation
On this project, contributors can earn up to $50 per hour, depending on their experience and contribution pace. Compensation varies across projects based on scope, complexity, and required expertise, so other projects on the platform may offer different earning levels.

