Haemo Mito Pipeline

A script-first bioinformatics pipeline that generates mitochondrial haplotypes from long-read sequencing of amplicons targeting the ~6 kb haemosporidian mitochondrial genome.

This software is developed and maintained by the Escalante–Pacheco Lab (Temple University) as part of our haemosporidian genomics toolkit.

This documentation site covers:

  • What the pipeline does (and what it does not do)

  • How to run it (CLI + optional desktop GUI)

  • How to interpret the outputs (PDF/JSON/FASTA/TSV)

  • The experimental protocol context (primer design + sequencing workflow)

Note

This pipeline is designed for research and biodiversity/population studies. It is not intended as a diagnostic workflow.

Pipeline overview

At a high level, the pipeline runs the following steps in order:

  1. FASTQ → Q-filter + subsample (default: Q≥30, n=5000)

  2. Subsample FASTA → MAFFT MSA (with orientation adjustment)

  3. MSA + FASTQ → haplotypes/OTUs + report

  4. (Optional) Local BLAST per haplotype against a user-provided reference DB

  5. All-vs-all haplotype distance matrix
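
To make step 1 concrete, here is a minimal Python sketch of a quality filter followed by a random subsample. It is an illustration only, not the pipeline's own code. It assumes that "Q≥30" means the arithmetic mean of a read's per-base Phred scores (Phred+33 encoding) is at least 30; the pipeline may define the threshold differently. The function names (mean_phred, parse_fastq, filter_and_subsample) and the seed parameter are hypothetical.

```python
import random


def mean_phred(quality_line, offset=33):
    """Arithmetic mean of per-base Phred scores (Phred+33 encoding assumed)."""
    scores = [ord(ch) - offset for ch in quality_line]
    return sum(scores) / len(scores) if scores else 0.0


def parse_fastq(lines):
    """Yield (header, sequence, quality) tuples from 4-line FASTQ records."""
    it = iter(lines)
    for header in it:
        if not header.strip():
            continue  # skip stray blank lines between or after records
        seq = next(it).strip()
        next(it)  # '+' separator line, ignored
        qual = next(it).strip()
        yield header.strip(), seq, qual


def filter_and_subsample(records, min_q=30, n=5000, seed=0):
    """Keep reads with mean Phred >= min_q, then draw at most n at random.

    The defaults mirror the values quoted in step 1 (Q>=30, n=5000);
    the fixed seed is an assumption added to make the sketch reproducible.
    """
    passed = [rec for rec in records if mean_phred(rec[2]) >= min_q]
    if len(passed) <= n:
        return passed  # fewer reads than requested: keep them all
    return random.Random(seed).sample(passed, n)
```

With a file handle opened on an uncompressed FASTQ, filter_and_subsample(parse_fastq(handle)) returns the retained reads as a list. Reads are returned in sampled order rather than file order when subsampling takes place.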

Figure: Pipeline overview diagram, a simplified view of the main steps.