About the job
Please submit your CV in English and include your English proficiency level.
Toloka AI, through the Mindrift platform, seeks a Senior Python Systems Developer for Functional Testing. This freelance, project-based contract supports leading technology firms with AI system testing and evaluation. The position is fully remote for candidates based in Virginia, United States.
Role overview
This role focuses on designing and executing functional black-box tests for large codebases written in a range of programming languages. The work involves maintaining Docker environments to ensure reproducible builds and reliable test execution. It also involves tracking code coverage, configuring automated scoring, and using large language models (LLMs) to automate tasks and improve code quality.
What you will do
- Design and run functional black-box tests for codebases written in various languages.
- Set up and maintain Docker environments for consistent builds and test execution.
- Track code coverage and configure automated scoring to align with industry standards.
- Use LLM-based coding tools (such as Roo Code and Claude Code) to automate repetitive tasks, speed up development, and improve code quality.
Requirements
- At least 5 years of experience as a software engineer, with deep experience in Python.
- Strong knowledge of pytest, including fixtures, fixture scopes such as session scope, and test timeouts.
- Experience with black-box functional testing for CLI tools.
- Advanced skills with Docker, including writing Dockerfiles, managing user contexts, and securing workspaces.
- Proficiency in Linux and Bash scripting, with the ability to debug inside containers.
- Familiarity with modern Python tools such as uv, pyproject.toml, and packaging.
- Ability to read and analyze code in multiple languages (C, C++, Rust, Go) with LLM support.
- Experience using LLM-based coding tools (Claude Code, Roo Code, Cursor) to generate test cases and accelerate development.
- English language skills at B2 level or above.
Preferred experience
- Background working with agent evaluation platforms and MCP CLI.
Technologies and tools
Python (pytest, uv, Pillow), Docker, Bash, Git submodules, C, C++, Rust, Go (for code comprehension), Dagger, GitHub Codespaces, LLM-based coding tools (Claude Code, Roo Code, Cursor), coverage.py, gcov, kcov.
Project details and compensation
- Freelance, project-based work through the Mindrift platform (powered by Toloka AI).
- Fully remote, with a flexible schedule and workload (20-30 hours per week).
- Rates up to $80 per hour, depending on project scope and experience.
