About the job
We invite you to submit your CV in English, including your level of English proficiency.
Mindrift connects talented professionals with project-based AI opportunities at leading technology companies, focusing on the testing, evaluation, and enhancement of AI systems. Note that participation is project-based and does not constitute permanent employment.
Opportunity Overview:
As a Senior Evaluation Engineer, you will develop challenging coding test cases that rigorously evaluate AI coding systems:
- Review and enhance realistic coding tasks derived from existing production codebases, ensuring they possess an achievable scope, clear requirements, and relevant information sources.
- Create comprehensive functional tests that accurately validate end-to-end behaviors and edge cases, going beyond superficial checks.
- Design challenging yet fair tasks where AI has access to all necessary context but must engage in complex reasoning to succeed (e.g., information dispersed across files and external sources).
- Analyze AI failures to distinguish areas where the model struggles from areas where it is strong.
- Iterate on your work based on feedback from expert QA reviewers who assess your submissions against seven quality criteria.
Desired Qualifications:
This role is ideal for experienced developers, software engineers, and test automation specialists who are interested in part-time, non-permanent projects. Candidates should possess:
- A degree in Computer Science, Software Engineering, or a related field.
- Over 5 years of experience in software development, primarily using Python (pytest, async/await, subprocess, file operations).
- A background in full-stack development, with a balanced focus on creating React-based interfaces and robust back-end systems.
- Experience writing tests (functional, integration), not just executing them.
- Familiarity with Docker containers (for running evaluations locally within containers).
- An understanding of CI/CD processes (using GitHub Actions: triggers, labels, interpreting results).
- English proficiency at a B2 level or higher.
Application Process:
Apply → Pass qualifications → Join a project → Complete tasks → Get compensated.
Effort Expectation:
The tasks for this project are estimated to require approximately 20 hours to complete, depending on complexity. This is merely an estimate, and you have the flexibility to choose your working hours. Submissions must meet the deadline and adhere to the specified acceptance criteria to be accepted.
Compensation:
- Paid contributions, with rates up to $30/hour*.
- Fixed project rates or individual rates depending on the project.
- Some projects may include incentive payments.
*Note: Rates are determined based on expertise, skills assessment, location, project requirements, and other factors. Higher rates may be offered to highly specialized experts. Lower rates may apply during onboarding or non-core project phases. Payment details will be provided for each project.

