About the job
Join our team as a Bioinformatics Software Engineer, where you will play a vital role in designing innovative software solutions that enhance research productivity and reliability. You will collaborate with researchers to develop cutting-edge software and data services that adhere to modern standards of reproducibility.
Key Responsibilities:
We are seeking an exceptional Bioinformatics Software Engineer with expertise in creating, deploying, and maintaining scalable bioinformatics pipelines within cloud environments. You will take charge of the code base that underpins large-scale genomic processing and analysis workflows at the SMaHT Data Analysis Center, which handles multi-omic data, including Whole Genome Sequencing (WGS) and RNA-Seq data from Illumina, PacBio, and Oxford Nanopore Technologies (ONT) platforms. The ideal candidate will possess a strong grasp of next-generation sequencing (NGS) data analysis, workflow automation, cloud computing, and best practices in cloud software engineering. This position is essential for supporting research and production environments where reproducibility, scalability, and performance are paramount.
Job Duties:
- Design, implement, and maintain bioinformatics pipelines for high-throughput sequencing data, covering steps such as alignment, quality control, and variant calling for WGS and RNA-Seq, similar to those in our existing repositories: https://github.com/smaht-dac/main-pipelines.
- Develop reproducible, thoroughly tested, and automated workflows utilizing workflow management systems (especially the Common Workflow Language, CWL).
- Architect and manage AWS-based computational infrastructure to facilitate pipeline execution, ensuring automated deployment, scaling, and monitoring.
- Containerize workflows with Docker or similar technologies for effective execution and portability.
- Integrate CI/CD tools to automate testing, deployment, and release versioning, safeguarding data integrity and correct pipeline execution.
- Create utility tools for metadata management, file integrity verification, file format conversion (e.g., VCF, BAM to CRAM), and integration with the SMaHT Data Portal.
- Collaborate across teams with research scientists, engineers, and IT personnel to refine requirements and deliver top-notch solutions.
- Document code, workflows, and infrastructure configurations in a clear and accessible manner.
