Projects
Explore genomics and data science projects where scientific creativity meets rigorous methodology, and every detail is designed to turn complex sequencing data into clear, actionable insight.
Genome Assembly & Annotation
High‑quality genome assemblies and annotations that provide a solid foundation for downstream discovery.
- Long-read and hybrid assemblies
- Repeat-aware masking and TE identification
- Comparative genomics across samples
Clinical Exome & Cancer Genomics
Clinical‑grade exome and panel workflows that support oncology decision‑making.
- Trio/single exome analysis for germline and somatic variants
- 70‑gene bladder cancer panel (NMIBC vs MIBC stratification)
- Patient‑by‑gene mutation matrices and burden summaries
- Clinician‑ready reports and figures for theses/publications
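As a rough illustration of what a patient‑by‑gene mutation matrix with a burden summary can look like, here is a minimal Python sketch. The function name, patient IDs, and variant calls are hypothetical and only show the shape of the data; they are not taken from the panel workflow itself.

```python
from collections import defaultdict


def mutation_matrix(calls):
    """Build a patient-by-gene count matrix and a per-patient burden.

    `calls` is an iterable of (patient_id, gene) pairs, one per variant call.
    """
    calls = list(calls)
    patients = sorted({p for p, _ in calls})
    genes = sorted({g for _, g in calls})

    # Count how many variant calls each patient has in each gene.
    counts = defaultdict(int)
    for patient, gene in calls:
        counts[(patient, gene)] += 1

    # Nested dict: matrix[patient][gene] -> number of calls (0 if none).
    matrix = {p: {g: counts[(p, g)] for g in genes} for p in patients}
    # Burden here is simply the total number of calls per patient.
    burden = {p: sum(row.values()) for p, row in matrix.items()}
    return matrix, burden


# Hypothetical example calls (illustrative only).
example_calls = [
    ("P1", "TP53"),
    ("P1", "FGFR3"),
    ("P2", "FGFR3"),
    ("P2", "FGFR3"),
]
matrix, burden = mutation_matrix(example_calls)
```

In practice the burden would usually be normalized (for example per megabase of panel footprint) and restricted to filtered variants; the raw count above is only the simplest possible summary.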


Variant Calling & Interpretation
End‑to‑end variant analysis pipelines that turn raw reads into prioritized variants and clear reports.
- Germline and somatic variant calling
- Rigorous filtering and QC
- Annotation with public and custom databases
- Summaries tailored for scientists and clinicians
Metagenomics & Microbiome Pipelines
Nextflow‑based metagenomic and viral pipelines for complex microbiome studies.
- Shotgun metagenome QC, assembly, and taxonomic/functional profiling
- Viral genome analysis and phage–bacteria interaction workflows
- Containerized, reproducible pipelines on Slurm HPC
- SOPs and documentation for experimental collaborators


Transposable Elements & Genome Evolution
In‑depth characterization of repetitive elements to understand genome architecture and evolution.
- TE discovery and classification
- Chromosome‑level TE landscapes
- Association with recombination and gene regions
- Manuscript‑ready figures and statistics
Reproducible NGS Workflow Engineering
Production‑grade pipelines built for scale, reuse, and auditability.
- Nextflow/Snakemake workflows for WGS/WES, RNA‑seq, metagenomics
- Docker/Singularity‑based, CI/CD‑friendly designs
- Performance tuning on Slurm HPC and basic cloud readiness
- Clear versioning, logging, and SOPs for teams


RNA‑seq & Expression Analysis
Transcriptome‑level insights that connect genomic variation to functional readouts.
- Differential expression analysis
- Transcript assembly and quantification
- Pathway and enrichment analysis
- Integrated visualization of key signatures
Open‑Source Analysis & Visualization Tools
Reusable code resources that accelerate analysis and figure generation.
- Genome dot‑plot tool for synteny and structural variation
- GC content and k‑mer uniqueness visualization
- Sex‑chromosome circos plots for coverage/SNP density
- DESeq2 and qPCR R Markdown templates for publication‑ready plots
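To give a sense of the calculation behind a GC content track, the following is a minimal Python sketch of sliding‑window GC fractions. The function name, window size, and step are illustrative assumptions and do not reflect the interface of the tools listed above.

```python
def gc_content_windows(seq, window=4, step=4):
    """Return (start, gc_fraction) for each full window along `seq`."""
    seq = seq.upper()
    results = []
    # Only full-length windows are reported; a trailing partial window is skipped.
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = sum(1 for base in chunk if base in "GC")
        results.append((start, gc / window))
    return results


# Toy sequence: first window is all G/C, second window has none.
windows = gc_content_windows("GGCCAATT", window=4, step=4)
```

The resulting list of window starts and fractions can be passed to any plotting library to draw the track; real genomes would typically use windows of kilobases rather than a few bases.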


Precision Medicine Ready Pipelines
Pipelines designed with clinical and translational teams in mind.
- Reproducible, version‑controlled workflows
- Documentation for validation and audits
- Configurable reporting templates
- Integration with existing lab or bioinformatics infrastructure
Machine Learning
Applied machine learning and deep learning that connect molecular profiles with imaging data, supporting biomarker discovery and hypothesis generation in cancer and infectious disease.
- Built MATLAB pipelines for image‑based Zika‑infected cell detection using segmentation and feature extraction.
- Integrated TCGA clinical, mutation, and expression data in R to stratify HR‑mutant vs HR‑intact tumors.
- Applied clustering, PCA/3D‑PCA, and visualization to reveal molecular subtypes and candidate biomarkers.
- Generated publication‑ and grant‑ready figures (heatmaps, PCA plots, co‑mutation maps) for R21/R01 applications.


