About the job
Please submit your CV in English and specify your English proficiency level.
This is a freelance, project-based position with Toloka AI, offered through the Mindrift platform. Assignments focus on testing, evaluating, and improving AI systems for technology companies. This role is not a permanent employment contract.
Role overview
The Senior Python Systems Developer - Functional Testing Expert works fully remotely from Sweden. The main focus is on large-scale codebases, reproducible Docker environments, and using LLM-based coding tools such as Roo Code and Claude Code to streamline both development and testing. Projects require hands-on work with functional testing, automation, and code quality improvement.
What you will do
- Design and implement functional black box tests for codebases in multiple languages.
- Set up and maintain Docker environments to ensure reproducible builds and consistent testing across platforms.
- Monitor code coverage and configure automated scoring systems to meet industry standards.
- Use LLM-based tools (Roo Code, Claude Code) to automate repetitive tasks, accelerate development, and improve code quality.
Requirements
- Minimum 5 years as a Software Engineer, with deep experience in Python.
- Advanced skills with pytest, including fixtures (session-scoped fixtures in particular) and test timeouts. Strong background in black-box functional testing for CLI tools.
- Expertise in Docker: writing reproducible Dockerfiles, managing user contexts, and securing workspaces.
- Strong command of Linux and Bash scripting, including debugging inside containers.
- Familiarity with modern Python tooling and packaging, such as uv and pyproject.toml.
- Ability to read and understand code in C, C++, Rust, and Go, using LLMs as needed.
- Experience using LLM-based tools (Claude Code, Roo Code, Cursor) to speed up development and generate test cases.
- English proficiency at B2 level or above.
Preferred qualifications
- Experience with agent evaluation platforms and MCP CLI.
Tools and technologies
Python (pytest, uv, Pillow), Docker, Bash, Git submodules, C/C++/Rust/Go (reading), Dagger, GitHub Codespaces, LLM-based tools (Claude Code, Roo Code, Cursor), coverage.py, gcov, kcov.
Engagement and compensation
- Freelance, project-based work through Mindrift (powered by Toloka AI).
- Fully remote with flexible scheduling. Typical commitment is 20-30 hours per week.
- Compensation varies by project scope and expertise. AI trainers can earn up to $50 per hour.
