About the job
About Basis
Basis is a pioneering nonprofit organization focused on applied AI research, driven by dual objectives.
Our first goal is to comprehend and construct intelligence. We endeavor to lay down the mathematical principles defining reasoning, learning, decision-making, understanding, and explanation, while also developing software that embodies these principles.
The second aim is to enhance society's capability to tackle complex challenges. We strive to broaden the scale, complexity, and diversity of problems we can address today, and more critically, to hasten our capacity to resolve future challenges.
To fulfill these ambitions, we are creating an innovative technological foundation inspired by human reasoning, alongside a collaborative organization that prioritizes human values.
About the Role
As a Data Engineer on the Platform team at Basis, you will build reliable data pipelines with comprehensive provenance and quality controls. You will curate documented datasets for training and evaluation while ensuring that our data infrastructure scales effectively. Your work will cover both platform-specific data needs and cross-project data coordination, minimizing redundancy and fostering shared datasets.
We are in search of technically proficient individuals who regard data quality as paramount. The ideal candidate has experience with machine learning data pipelines, understands the complete lifecycle from raw data to model training and evaluation, and approaches data provenance, lineage tracking, and quality assurance with rigor. You will blend software engineering best practices with a profound understanding of data systems and machine learning requirements.
This role operates across both the Platform and Research teams, working on infrastructure that supports our commercial offerings and internal research initiatives. You will play a vital role in scaling Basis's data operations to support medium-scale models, ensuring data governance as we cater to external clients, and building systems that researchers can rely on for reproducible experiments.
We seek individuals who are committed to executing high-quality, robust data engineering while remaining open to iteration, learning from real-world usage, and exploring diverse approaches to achieve excellence.
At Basis, collaboration is key, both internally and with our external partners. We value those who enjoy laying the groundwork for solving grand challenges that extend beyond individual efforts.

