About the job
About Us:
Calico Life Sciences LLC, a subsidiary of Alphabet, is at the forefront of research and development, focusing on understanding the biology of human aging through advanced technologies and innovative model systems. Our mission is to leverage this knowledge to create interventions that promote longer, healthier lives. With cutting-edge technology labs, a commitment to curiosity-driven scientific discovery, and a dynamic drug-development pipeline in collaboration with academic and industry partners, Calico is a vibrant environment for catalyzing medical breakthroughs.
Role Overview:
We are seeking a Senior Data Engineer to join our collaborative Engineering team and lead the establishment of the Drug Discovery Data Engineering group. The ideal candidate is an enthusiastic team player who is meticulous, highly organized, and adept at navigating intricate data, software, and scientific challenges.
In this role, you will serve as a crucial technical liaison among our Medicinal Chemistry, Automation, Machine Learning, Assay Technology, and Protein Sciences teams. You will oversee projects from initial requirements through to production deployment, engineering high-performance data systems that integrate with our molecular databases (CDD Vault), inventory systems (Mosaic), electronic lab notebooks (Benchling), our internal data warehouse (BigQuery), and our proprietary AI platform. As the inaugural hire for this team, your contributions will be instrumental in shaping data flows, developing web applications for stakeholder engagement, and establishing a progressive engineering culture in this vital growth sector.
Key Responsibilities:
- Project Leadership: Collaborate with scientists across Assay Technology, Medicinal Chemistry, and Protein Sciences to gather requirements, design solutions, and implement production-grade software that enhances data movement and analysis.
- System Integration: Create and deploy effective integrations between internal pipelines and third-party platforms, particularly involving the CDD molecular database, Mosaic inventory systems, and Benchling ELN.
- Data Flow Optimization: Develop and refine data flows organization-wide (e.g., moving data smoothly from Machine Learning to Protein Sciences to Assay Technology) to expedite the drug discovery feedback loop.
- Full-Stack Development: Build data systems and internal web applications (using React and Python) that let stakeholders review, visualize, and communicate complex scientific data effectively.
- Mentorship & Leadership: Act as a senior technical advisor within the broader Engineering team.

