About the job
About Voltai
At Voltai, we are pioneering advancements in artificial intelligence by developing sophisticated world models and agents that learn, evaluate, plan, and interact with the physical environment. Our primary focus is hardware, particularly electronic systems and semiconductors, where AI can surpass traditional human cognitive limitations in design and creation.
About the Team
Our team comprises exceptional talent, including former Stanford professors, acclaimed SAIL researchers, Olympiad medalists, and industry leaders from companies such as Google, AMD, and Broadcom. We are backed by top Silicon Valley investors and supported by a diverse group of experts, including former U.S. government officials, all committed to driving innovation in AI and hardware design.
Role Overview
As a Post-Training Research Engineer, you will post-train cutting-edge models to autonomously execute complex tasks across the semiconductor design and verification pipeline. The models you help develop will optimize chip architectures, refine RTL code, run simulations, identify verification gaps, and iteratively improve designs to accelerate semiconductor innovation. Working alongside leading experts in hardware design and verification, you will craft reinforcement learning environments that capture the complexities of chip design workflows. Your contributions will include structured reward functions, scaling strategies, and evaluation frameworks that improve model reliability, efficiency, and creativity in semiconductor reasoning.
Ideal Candidate Profile
You may excel in this role if you possess experience in:
- Creating and scaling reinforcement learning environments for large language models or multimodal agents.
- Building high-quality evaluation datasets and benchmarks for complex reasoning or design challenges.
- Collaborating closely with domain experts in hardware and verification to establish evaluation metrics, constraints, and simulation conditions.
- Designing reward functions and feedback pipelines that balance correctness, performance, and design efficiency.
- Conducting large-scale reinforcement learning fine-tuning or post-training experiments on frontier models.
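To give a flavor of the reward-design work described above, here is a minimal toy sketch of a multi-objective reward for a chip-design agent. All names, metrics, weights, and thresholds are hypothetical illustrations, not Voltai's actual pipeline: it gates everything on functional correctness, then shapes the score with timing and area terms.

```python
def design_reward(passes_tests: bool,
                  timing_slack_ns: float,
                  area_um2: float,
                  area_budget_um2: float) -> float:
    """Toy reward balancing correctness, performance, and efficiency.

    All metrics and weights here are illustrative placeholders.
    """
    if not passes_tests:
        # Correctness gates everything: a design that fails
        # verification earns no reward at all.
        return 0.0
    # Performance term: reward positive timing slack, capped at 1.0
    # (1 ns of slack is an arbitrary normalization constant here).
    perf = min(max(timing_slack_ns, 0.0) / 1.0, 1.0)
    # Efficiency term: reward staying under the area budget.
    eff = max(0.0, 1.0 - area_um2 / area_budget_um2)
    # Correctness floor of 0.5, plus shaped performance/efficiency terms.
    return 0.5 + 0.3 * perf + 0.2 * eff
```

The gating-plus-shaping structure is one common way to keep an RL agent from trading away correctness for secondary objectives; real pipelines would draw these signals from simulators and synthesis reports rather than scalar arguments.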

