We are looking for a Senior Genomics Data Engineer to join our Helix Clinicogenomic Data Engineering team. In this role, you will develop innovative solutions that simplify complex genomic data, and you will design, build, and optimize robust pipelines for processing large-scale genomic and clinical data.
Requirements
- Bachelor's or Master's degree in Computer Science, Bioinformatics, Engineering, or a related field, with 5+ years of experience
- Deep domain knowledge in molecular biology, next-generation sequencing, or genomics
- Demonstrated experience processing large-scale genetic data (exome/whole genome) in a variety of formats, including but not limited to VCF, CRAM, BAM, and PLINK.
- Strong experience using industry-standard bioinformatics tools such as bcftools, HTSlib, and samtools.
- Experience with genomic data-reduction techniques, such as principal component analysis (PCA)
- Expert-level proficiency in Python
- Proven experience designing and building distributed systems on AWS, including expertise with services like Glue, EMR, S3, Lambda, and DynamoDB.
- Proficiency with infrastructure-as-code frameworks (e.g., AWS CDK, Terraform).
- Expertise with ETL pipeline automation, CI/CD, and workflow management tools such as Airflow, AWS Glue, and AWS Step Functions
- Familiarity with database design, data manipulation, and data quality techniques
- Demonstrated ability to thrive in a fast-paced, rapidly changing environment.
Benefits
- Comprehensive health insurance with date-of-hire eligibility
- Above-average employer-paid premium coverage
- 12-week Helix Paid Parental Leave option
- 401(k) with employer matching of up to 3% and 100% vesting on the date of hire
- Comprehensive Well-Being Benefits
- Flexible PTO
- Remote options for many roles and a home office stipend