About the job
At Merge Labs, we are at the forefront of scientific innovation, dedicated to harmonizing biological and artificial intelligence to enhance human potential and experiences. Our mission is to pioneer groundbreaking brain-computer interface technologies that communicate with the brain at unprecedented speeds, integrate seamlessly with advanced AI, and ensure safety and accessibility for all users.
About Our Team:
Our team turns ambitious brain-computer interface concepts into working algorithms. By integrating knowledge from synthetic biology, neuroscience, device physics, signal processing, and machine learning, we develop effective methods for connecting human intelligence with artificial intelligence. Our work involves designing experiments, developing analytical frameworks, collecting data, training models, and optimizing performance to build scalable Brain-AI systems. We work with urgency while balancing creative exploration with engineering rigor, because we believe that enhancing human ability, agency, and experience is one of the most pressing challenges of our era.
Role Overview:
As the most senior data engineer on our team, you will take charge of the data pipelines that capture, process, and deliver the essential data driving Merge’s molecular optimization platform. Your role will include converting diverse laboratory outputs into structured, queryable datasets that enable scientific analysis and closed-loop machine learning. Collaborating closely with experimentalists, you will establish data standards and metadata conventions, and work alongside ML engineers to ensure results are integrated into production-grade systems.
This position reports to the Head of Software and involves extensive cross-functional collaboration spanning software engineering, data architecture, and scientific informatics. As part of the Core Software team, you will be supported by infrastructure specialists and will coordinate with the Application Development Lead to ensure comprehensive capture of scientific and user input.
Your Responsibilities Include:
Developing and maintaining ingestion pipelines from laboratory instruments into centralized data storage.
Designing schemas and defining metadata standards for experimental data.
Implementing post-processing pipelines to generate analysis-ready datasets for our scientific teams.
Setting up monitoring, alerting, and structured logging so that data pipelines run reliably.

