About the job
Charter:
Join a pioneering team dedicated to developing the first precise AI systems for predicting drug toxicity, aiming to replace traditional lab and animal experiments.
About Axiom and the role:
At Axiom, we are committed to creating advanced AI systems for drug safety and toxicity evaluation. Drug toxicity is a leading cause of failure in drug development, and by addressing this challenge, we can accelerate the delivery of innovative medicines to patients. We are seeking a passionate Data Engineer to lead the design and implementation of pipelines, systems, and tools that transform raw chemical, biological, and clinical data into machine learning-ready datasets and actionable insights for our clients. This role involves close collaboration with our machine learning, laboratory, and product teams. Together with them, you will develop literature research and data platforms driven by large language models (LLMs), enhance the inference capabilities of image and graph neural networks, automate ETL processes from various sources, and uphold the integrity of critical datasets that inform key decisions within the organization.
What we are looking for:
We aim to attract individuals who inspire and elevate our entire team. Ideal candidates will bring high energy, autonomy, and a discerning eye for what truly matters. They should be proactive, continuously identifying what needs to be done and then doing it. Candidates must possess technical excellence, a deep mastery of their craft, and an insatiable curiosity that keeps them at the cutting edge of technology, and they must work effectively across AI, engineering, product, biology, chemistry, and business domains. They may have experience at larger tech firms, but they are now looking for a challenging adventure that promises significant rewards and satisfaction at the end.
What you will be doing:
Design and maintain Axiom's core research platform data systems, encompassing ingestion, processing, storage, and serving capabilities.
Collaborate with scientists to define their data requirements and develop streamlined APIs for accessing chemical and biological datasets.
Architect LLM systems to curate, cleanse, and analyze human clinical trial data, ensuring proper evaluations and observability for these systems.
Create distributed systems for executing large-scale LLM tasks that clean and curate biological and clinical data.
Establish quality assurance measures, testing tools, and monitoring systems to maintain the accuracy and reliability of data and model outputs.

