Phylogenetic Assignment of Named Global Outbreak Lineages
The primary software tool for assigning SARS-CoV-2 genome sequences to Pango lineages. Trusted by researchers and public health agencies worldwide for accurate, rapid lineage classification.
Pangolin is an open-source software tool that assigns the most likely Pango lineage to SARS-CoV-2 query sequences. Using machine learning models and phylogenetic placement algorithms, it provides rapid, accurate lineage classification essential for genomic surveillance.
Process thousands of sequences quickly using optimized algorithms and models.
High-accuracy lineage assignments validated against gold-standard designations.
pangolin-data is updated regularly with new lineage definitions.
Free to use under the GPL-3.0 license. Contribute on GitHub.
Pangolin can be installed via Bioconda (recommended) or pip. It requires Python 3.8+ and works on Linux, macOS, and Windows (via WSL).
Install via Conda (Recommended)
The easiest method with all dependencies managed automatically.
Update pangolin-data
Keep your lineage definitions current with regular updates.
Run Analysis
Process your FASTA files with a single command.
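The update and run steps above can be scripted. The following is a minimal sketch, assuming pangolin is already installed and on your PATH. The helper names (`build_update_command`, `build_run_command`, `run_pangolin`) are hypothetical wrappers written for this example. The flags `--update-data`, `--outfile`, and `--threads` are pangolin command-line options; check `pangolin --help` for your installed version before relying on them.

```python
import shutil
import subprocess


def build_update_command():
    """Command that refreshes pangolin-data (the lineage definitions)."""
    return ["pangolin", "--update-data"]


def build_run_command(fasta_path, outfile="lineage_report.csv", threads=1):
    """Command that assigns a lineage to every sequence in fasta_path."""
    return [
        "pangolin",
        str(fasta_path),
        "--outfile", outfile,
        "--threads", str(threads),
    ]


def run_pangolin(fasta_path, **kwargs):
    """Run pangolin if it is on PATH; raise a clear error otherwise."""
    if shutil.which("pangolin") is None:
        raise RuntimeError("pangolin not found on PATH; install it first")
    cmd = build_run_command(fasta_path, **kwargs)
    # check=True raises CalledProcessError on a non-zero exit status.
    subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # Only print the commands here; nothing is executed.
    print(" ".join(build_update_command()))
    print(" ".join(build_run_command("sequences.fasta", threads=4)))
```

Building the command as a list rather than a single string avoids shell quoting problems when file paths contain spaces.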
Pangolin uses a multi-step pipeline to assign lineages with high accuracy.
FASTA sequences are validated and aligned to the reference genome.
Sequences are scanned for constellation patterns to identify VOCs.
Sequences are placed on a global phylogenetic tree using UShER.
Results are written to a CSV report with lineage assignments, conflict and ambiguity scores, and QC status.
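To make the input validation step concrete, here is a simplified sketch of a FASTA reader and a length and ambiguity check. It is not Pangolin's own code: the function names and note strings are invented for illustration. The default thresholds (minimum length 25000, maximum N proportion 0.3) are assumed to mirror pangolin's `--min-length` and `--max-ambig` defaults, so confirm them against your installed version.

```python
def read_fasta(text):
    """Yield (name, sequence) pairs from FASTA-formatted text."""
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            # Keep only the first token of the header as the sequence name.
            name, chunks = (header[0] if header else ""), []
        elif name is not None:
            chunks.append(line.upper())
    if name is not None:
        yield name, "".join(chunks)


def qc_check(seq, min_length=25000, max_ambig=0.3):
    """Return (status, note) for one sequence.

    Thresholds are assumptions for this sketch, not guaranteed defaults.
    """
    if len(seq) == 0 or len(seq) < min_length:
        return "fail", "sequence too short"
    n_fraction = seq.count("N") / len(seq)
    if n_fraction > max_ambig:
        return "fail", "too many ambiguous bases"
    return "pass", ""


if __name__ == "__main__":
    demo = ">seq1 example\nACGTACGTNN\n>seq2\nNNNNNNNNAC\n"
    for name, seq in read_fasta(demo):
        # A tiny min_length is used so the toy sequences can pass.
        print(name, qc_check(seq, min_length=10))
```

Sequences that fail a check like this are not given a lineage; in the report they appear with a failing `qc_status`.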
Pangolin outputs a detailed CSV report with multiple fields for each sequence.
| Field | Description |
|---|---|
| taxon | Sequence name from the input FASTA file |
| lineage | Assigned Pango lineage (e.g., JN.1.1, BA.2.86) |
| conflict | Phylogenetic conflict score (lower is better) |
| ambiguity_score | Ambiguity in assignment (lower is better) |
| scorpio_call | Constellation-based variant call (e.g., Omicron (BA.1-like)) |
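| scorpio_support | Proportion of the constellation's defining variants that carry the alternative allele in the sequence |
| scorpio_conflict | Proportion of the constellation's defining variants that carry the reference allele in the sequence |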
| scorpio_notes | Additional notes on the assignment |
| version | Assignment method and lineage data version used (e.g., a value prefixed with PUSHER for UShER mode) |
| pangolin_version | Pangolin software version |
| scorpio_version | Scorpio version |
| constellation_version | Constellation definitions version |
| is_designated | Whether the sequence itself has been officially designated to a lineage (True/False) |
| qc_status | Quality control status (pass/fail) |
| qc_notes | Quality control notes |
| note | Additional processing notes |
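The report can be post-processed with any CSV library. The sketch below uses Python's standard `csv` module to keep only the sequences that passed QC and count them per lineage. The sample rows are made up for illustration and show only a subset of the columns listed above; the helper names are hypothetical.

```python
import csv
import io
from collections import Counter

# Made-up rows for illustration only; a real report has more columns.
SAMPLE_REPORT = """taxon,lineage,conflict,ambiguity_score,qc_status,qc_notes
sample_1,BA.2.86,0.0,,pass,
sample_2,JN.1.1,0.0,,pass,
sample_3,Unassigned,,,fail,
sample_4,JN.1.1,0.0,,pass,
"""


def load_report(handle):
    """Read a pangolin-style CSV report into a list of dicts."""
    return list(csv.DictReader(handle))


def passing_rows(rows):
    """Keep only sequences whose qc_status is 'pass'."""
    return [r for r in rows if r["qc_status"].strip().lower() == "pass"]


def lineage_counts(rows):
    """Count how many sequences were assigned to each lineage."""
    return Counter(r["lineage"] for r in rows)


if __name__ == "__main__":
    rows = load_report(io.StringIO(SAMPLE_REPORT))
    for lineage, n in lineage_counts(passing_rows(rows)).most_common():
        print(lineage, n)
```

For a real run, replace the `io.StringIO` handle with `open("lineage_report.csv", newline="")`, adjusting the file name to match your output.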