Computational Scientist I or II, Sequencing
Job ID: req1452
Employee Type: exempt full-time
Facility: Rockville: 9615 MedCtrDr
Location: 9615 Medical Center Drive, Rockville, MD 20850 USA
The Frederick National Laboratory is a Federally Funded Research and Development Center (FFRDC) sponsored by the National Cancer Institute (NCI) and operated by Leidos Biomedical Research, Inc. The lab addresses some of the most urgent and intractable problems in the biomedical sciences in cancer and AIDS, drug development and first-in-human clinical trials, applications of nanotechnology in medicine, and rapid response to emerging threats of infectious diseases.
Our core values of accountability, compassion, collaboration, dedication, integrity, and versatility serve as a guidepost for how we do our work every day in serving the public’s interest.
The Cancer Genomics Research Laboratory (CGR) investigates the contribution of germline and somatic genetic variation to cancer susceptibility and outcomes in support of the NCI's Division of Cancer Epidemiology and Genetics (DCEG), the world’s most comprehensive cancer epidemiology research group. CGR is located at the NCI-Shady Grove campus in Gaithersburg, MD and operated by Leidos Biomedical Research, Inc. We care deeply about discovering the genetic and environmental determinants of cancer, and new approaches to cancer prevention, through our contributions to the molecular, genetic, and epidemiologic research of the 70+ investigators in DCEG. Our bioinformaticians have both the passion to learn and the opportunity to apply their skills to our rich and varied short- and long-read sequencing datasets, generated in support of DCEG’s multidisciplinary family- and population-based studies. Working in concert with the epidemiologists, biostatisticians, and basic research scientists in DCEG’s intramural research program, CGR conducts targeted, whole-exome, and whole-genome sequencing studies, including analysis of germline and somatic variants, structural variation, copy number variation, metagenomics, transcriptomics, and more.
We are seeking an enthusiastic, creative, and collaborative bioinformatics scientist to support pipeline development and analysis for our broad portfolio of sequencing studies. If you have experience designing and deploying robust, reproducible, production-quality pipelines, then come join our talented team of bioinformaticians dedicated to understanding the genetics of cancer!
- Develop and maintain robust, tested pipelines for a wide variety of sequencing applications, with an emphasis on scalability, portability, and thorough documentation
- Deploy pipelines to HPC and cloud environments
- Lead efforts to identify and benchmark new tools and resources as needed to keep analyses current
- Thoughtfully synthesize results into clear presentations and concise summaries of work to support recommendations for next steps
- Collaborate closely with DCEG PIs on scientific manuscript development, submission, and revision, with significant opportunities for co-authorship and potentially first authorship
This position may be filled at the Computational Scientist I or II level.
To be considered for this position, you must minimally meet the knowledge, skills, and abilities listed below:
- Possession of a Doctoral degree from an accredited college/university in bioinformatics, statistics, computer science, genetics, computational biology, or a related field. Foreign degrees must be evaluated for U.S. equivalency.
- Computational Scientist I - no experience required beyond a Doctoral degree
- Computational Scientist II - a minimum of two (2) years of progressively responsible scientific and/or complex system management/bioinformatics experience
- Extensive pipeline development experience, including collaborative coding and use of source control (e.g. git)
- Experience with Snakemake, make, or other workflow management systems
- Experience with various environment/dependency management tools (e.g. pip, venv, conda, mamba) and containers (e.g. Singularity, Docker)
- Experience managing large datasets and computational tasks in a Linux-based high-performance computing environment
- Proficiency with Bash, Python, Perl, R, C/C++, and/or Java
- Team-oriented with excellent written and verbal communication skills, organizational skills, and attention to detail; ability to organize and execute multiple projects in parallel
- Demonstrated ability to proactively remain up to date on current bioinformatics techniques and resources, and to identify and benchmark novel software solutions against established reference datasets
Candidates with these desired skills will be given preferential consideration:
- Experience with software testing types including unit, integration, regression, and acceptance tests, as well as related packages (e.g. unittest, pytest, Test::More, TAP)
- Experience with CI/CD (GitLab CI/CD, Travis CI, CircleCI, etc.)
- Experience with documentation tools such as Sphinx or Doxygen
- Experience with Google Cloud, AWS, or managed cloud environments
- Familiarity with publicly available data sources and diverse genomic annotations (such as gnomAD/ExAC, ANNOVAR, VEP, snpEff, ClinVar, ClinGen, dbSNP, CIViC, COSMIC)
- Experience with databases (e.g. MySQL, FileMaker)
- Experience with machine learning
Equal Opportunity Employer (EOE) | Minority/Female/Disabled/Veteran (M/F/D/V) | Drug Free Workplace (DFW)