Career Profile

Hello! My name is Dan, and I am a bioinformatics and data science professional with 6+ years of industry experience in computational biology, specializing in analyzing complex biological datasets to drive innovation in biotech and pharma. I am skilled in machine learning, statistical modeling, and programming (Python, R, SQL), with a proven ability to lead projects, collaborate across disciplines, and deliver insights that advance research and product development. You can download a copy of my CV here.

Experience

Senior Bioinformatics Scientist

2022 - Present
PacBio, San Diego & Menlo Park

Develop and implement bioinformatics workflows for secondary analysis of Next-Generation Sequencing (NGS) data, including PacBio HiFi, SBB, Illumina, and Nanopore platforms, supporting oncology and microbiome research for internal and external clients. Perform data analysis and create visualizations in Python and R Markdown to evaluate variant calling performance across NGS platforms.

Collaborate with R&D and marketing teams to design and execute experiments, contributing to the creation of marketing collateral.

Senior Data Scientist

2021 - 2022
Novozymes, San Diego & Davis

Developed and deployed automated, data science-driven infrastructure for microbiome research, integrating shotgun metagenomics, 16S, metabolomics, metatranscriptomics, and open-source databases. Conducted biological discovery of novel bacterial and phage probiotic candidates from human and animal gut samples to advance disease treatment and promote healthy development.

Senior Data Scientist

2019 - 2021
Biota Technology, San Diego

Leveraged machine learning techniques to analyze microbial signatures from subsurface samples, enabling the identification of reservoir sources and their characteristics. By integrating metagenomic data with advanced algorithms, this approach uncovered patterns in microbial communities that correlate with specific reservoir conditions, such as temperature, pressure, and geochemical profiles. These insights aided in tracing reservoir connectivity, assessing production potential, and optimizing resource management strategies in energy exploration and production. Biota Technology was acquired by Novozymes in March 2021.

Projects

Below are brief summaries of projects I've worked on:

SeqScreen - Rapid and robust characterization of unknown pathogenic DNA/RNA sequences.
CRISPRs - Identifying novel CRISPR families from environmental samples.
Database influence on microbial classification - Examining the influence of the database over time on k-mer-based lowest common ancestor taxonomic classification.
VIROME - Viral metagenomics analysis pipeline and frontend web application to explore viral diversity from environmental and host-associated samples.
Viral profiling of irrigation water - Shotgun metagenomic approach to characterize the taxonomic and functional variations of microbial communities within surface water sources of crop irrigation.

Skills & Proficiency

Python & Jupyter

R ggplot2 & Matplotlib

Bioinformatics tools and libraries

Snakemake & Nextflow

SQL & MongoDB

Scikit-learn & Keras