Phylogenetic Assignment of Named Global Outbreak Lineages






Initial release: 30 April 2020
Stable release: 4.3.1[1] (26 July 2023)
Repository: github.com/cov-lineages/pangolin
Written in: Python
License: GNU General Public License v3.0
Website: pangolin.cog-uk.io

The Phylogenetic Assignment of Named Global Outbreak Lineages (PANGOLIN) is a software tool developed by Dr. Áine O'Toole[2] and members of the Andrew Rambaut laboratory, with an associated web application developed by the Centre for Genomic Pathogen Surveillance in South Cambridgeshire.[3] It implements a dynamic nomenclature (known as the Pango nomenclature) for classifying genetic lineages of SARS-CoV-2, the virus that causes COVID-19.[4] A user with a full genome sequence of a SARS-CoV-2 sample can submit that sequence to the tool, which compares it with other genome sequences and assigns the most likely lineage (Pango lineage).[5] Single or multiple runs are possible, and the tool can return further information on the known history of the assigned lineage.[5] It also interfaces with Microreact to show a time sequence of the locations where sequenced samples of the same lineage have been reported.[5] This feature draws on publicly available genomes obtained from the COVID-19 Genomics UK Consortium and on those submitted to GISAID.[5] The tool is named after the pangolin.

Context

PANGOLIN is a key component underpinning the Pango nomenclature system.[6]

Rambaut et al. (2020)[4] define a Pango lineage as a cluster of sequences associated with an epidemiological event, for instance an introduction of the virus into a distinct geographic area with evidence of onward spread. Lineages are designed to capture the emerging edge of the pandemic at a fine-grained resolution suited to genomic epidemiological surveillance and outbreak investigation.[citation needed]

Both the PANGOLIN tool and the Pango nomenclature system have been used extensively during the COVID-19 pandemic.[4][7][8]

Description

Lineage designation

Separately from the PANGOLIN tool, Pango lineages are regularly and manually curated on the basis of the diversity currently circulating worldwide. A large phylogenetic tree is constructed from an alignment of publicly available SARS-CoV-2 genomes. Sub-clusters of sequences in this tree are then examined manually and cross-referenced against epidemiological information to designate new lineages. Lineages can be designated by data producers, and lineage suggestions can be submitted to the Pango team via a GitHub issue request.[9][10][further explanation needed]

Model training

These manually curated lineage designations, together with the associated genome sequences, are the training input for a machine learning model. The model, covering both training and assignment, has been termed 'pangoLEARN'. The current version of pangoLEARN uses a classification tree based on the scikit-learn implementation[11] of a decision tree classifier.
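The idea of training a decision tree classifier on designated sequences and then using it for assignment can be illustrated with a minimal sketch. This is not the pangoLEARN code: the one-hot encoding, the eight-site toy sequences, the lineage labels attached to them, and the helper names `one_hot` and `assign` are all assumptions made for illustration. Only `sklearn.tree.DecisionTreeClassifier` is taken from the description above.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

BASES = "ACGT"

def one_hot(seq):
    """Encode an aligned sequence as a flat 0/1 vector with four slots per site.

    Characters other than A, C, G, T (for example N or a gap) leave all
    four slots at zero. This encoding is an illustrative assumption.
    """
    vec = np.zeros(len(seq) * len(BASES), dtype=np.int8)
    for i, base in enumerate(seq.upper()):
        j = BASES.find(base)
        if j >= 0:
            vec[i * len(BASES) + j] = 1
    return vec

# Hypothetical "designated" sequences: invented 8-site alignments paired
# with lineage labels. Real training data would be full genomes.
training = [
    ("ACGTACGT", "A"),
    ("ACGTACGA", "A"),
    ("ACGTTCGT", "B"),
    ("ACGTTCGA", "B"),
    ("ACCTTCGT", "B.1"),
    ("ACCTTCGA", "B.1"),
]

X = np.array([one_hot(seq) for seq, _ in training])
y = [lineage for _, lineage in training]

# Train the classification tree on the curated designations.
model = DecisionTreeClassifier(random_state=0)
model.fit(X, y)

def assign(seq):
    """Return the lineage label the trained tree predicts for one sequence."""
    return model.predict(one_hot(seq).reshape(1, -1))[0]

for seq, lineage in training:
    print(seq, "designated", lineage, "assigned", assign(seq))
```

Because the tree is grown without a depth limit and no two training sequences with different labels are identical, it reproduces the designations it was trained on. How it labels an unseen sequence depends on which sites the tree happens to split on.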

Lineage assignation

Originally, PANGOLIN used a maximum-likelihood-based algorithm to assign each query SARS-CoV-2 sequence its most likely lineage. Since the release of version 2.0 in July 2020, however, it has used the 'pangoLEARN' machine-learning-based algorithm to assign lineages to new SARS-CoV-2 genomes.[12] This approach is fast and can assign large numbers of SARS-CoV-2 genomes in a relatively short time.[13]

Availability

PANGOLIN is available as a command-line tool, installable via Conda or from a GitHub repository,[12] and as a web application[14] with a drag-and-drop graphical user interface. As of January 2021, the PANGOLIN web application had assigned more than 512,000 unique SARS-CoV-2 sequences.[citation needed]
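A minimal sketch of how the command-line tool might be driven from Python follows. It rests on several assumptions that should be checked against `pangolin --help` for the installed version: that the tool is invoked as `pangolin <query.fasta>`, that `--outfile` names the report file, and that the report is a CSV with `taxon` and `lineage` columns. The helper names and the two-row example report are hypothetical.

```python
import csv
import io
import subprocess

def build_pangolin_command(query_fasta, outfile="lineage_report.csv"):
    """Build the argument list for a pangolin run.

    Assumption: `--outfile` sets the report file name; verify this
    against `pangolin --help` for the installed version.
    """
    return ["pangolin", query_fasta, "--outfile", outfile]

def run_pangolin(query_fasta, outfile="lineage_report.csv"):
    """Run pangolin on a FASTA file (requires pangolin on the PATH)."""
    subprocess.run(build_pangolin_command(query_fasta, outfile), check=True)

def read_lineages(report_text):
    """Map each sequence name to its assigned lineage.

    Assumption: the report has `taxon` and `lineage` columns.
    """
    reader = csv.DictReader(io.StringIO(report_text))
    return {row["taxon"]: row["lineage"] for row in reader}

# Hypothetical report content, reduced to the two assumed columns.
example_report = "taxon,lineage\nsample1,B.1.1.7\nsample2,A\n"
lineages = read_lineages(example_report)
print(lineages)
```

`run_pangolin` is defined but not called here, so the sketch runs without the tool installed. In practice the report file written by the run would be read from disk and passed to `read_lineages`.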

Creators and developers

PANGOLIN was created by Áine O'Toole and the Rambaut lab and released on 5 April 2020. The main developers of PANGOLIN are Áine O'Toole and Emily Scher; many others have contributed to various aspects of the tool, including Ben Jackson, J.T. McCrone, Verity Hill, and Rachel Colquhoun of the Rambaut Lab.[5]

The PANGOLIN web application was developed by the Centre for Genomic Pathogen Surveillance,[14] namely Anthony Underwood, Ben Taylor, Corin Yeats, Khalil Abu-Dahab, and David Aanensen.[5]

References

  1. "Release 4.3.1". 26 July 2023. Retrieved 1 August 2023.
  2. O'Toole, Áine; Scher, Emily; Underwood, Anthony; Jackson, Ben; Hill, Verity; McCrone, John T; Colquhoun, Rachel; Ruis, Chris; Abu-Dahab, Khalil; Taylor, Ben; Yeats, Corin; Du Plessis, Louis; Maloney, Daniel; Medd, Nathan; Attwood, Stephen W; Aanensen, David M; Holmes, Edward C; Pybus, Oliver G; Rambaut, Andrew (5 July 2021). "Assignment of Epidemiological Lineages in an Emerging Pandemic Using the Pangolin Tool". Virus Evolution. 7 (2): veab064. doi:10.1093/ve/veab064. PMC 8344591. PMID 34527285.
  3. "Real-Time Epidemiology for COVID-19". www.pathogensurveillance.net. Archived from the original on 17 January 2021. Retrieved 22 January 2021.
  4. Rambaut, A.; Holmes, E.C.; O'Toole, Á.; et al. (2020). "A dynamic nomenclature proposal for SARS-CoV-2 lineages to assist genomic epidemiology". Nature Microbiology. 5 (11): 1403–1407. doi:10.1038/s41564-020-0770-5. PMC 7610519. PMID 32669681. S2CID 220544096.
  5. "Pangolin web application release". virological.org. May 2020. Archived from the original on 10 February 2021. Retrieved 18 February 2021.
  6. Rambaut, Andrew; Holmes, Edward C.; O'Toole, Áine; Hill, Verity; McCrone, John T.; Ruis, Christopher; Du Plessis, Louis; Pybus, Oliver G. (15 July 2020). "Addendum: A dynamic nomenclature proposal for SARS-CoV-2 lineages to assist genomic epidemiology". Nature Microbiology. 6 (3): 415. doi:10.1038/s41564-021-00872-5. PMC 7845574. PMID 33514928.
  7. Pipes, Lenore; Wang, Hongru; Huelsenbeck, John P; Nielsen, Rasmus (9 December 2020). Malik, Harmit (ed.). "Assessing Uncertainty in the Rooting of the SARS-CoV-2 Phylogeny". Molecular Biology and Evolution. 38 (4). Oxford University Press (OUP): 1537–1543. doi:10.1093/molbev/msaa316. ISSN 0737-4038. PMC 7798932. PMID 33295605. Archived from the original on 10 December 2020. Retrieved 22 January 2021.
  8. Jacob, Jobin John; Vasudevan, Karthick; Pragasam, Agila Kumari; Gunasekaran, Karthik; Kang, Gagandeep; Veeraraghavan, Balaji; Mutreja, Ankur (22 December 2020). "Evolutionary tracking of SARS-CoV-2 genetic variants highlights intricate balance of stabilizing and destabilizing mutations". bioRxiv 10.1101/2020.12.22.423920. Phylogenetic Assignment of Named Global Outbreak LINeages tool (PANGOLIN) has been the most widely used tool for lineage assignment to newly emerging variants.
  9. "pangoLEARN Store of the trained model for PANGOLIN to access". GitHub: cov-lineages/pangoLEARN. Archived from the original on 3 January 2021. Retrieved 13 February 2021.
  10. "PANGO lineages". cov-lineages.org. Archived from the original on 28 February 2021. Retrieved 4 March 2021.
  11. "sklearn.tree.DecisionTreeClassifier". scikit-learn.org. Archived from the original on 19 February 2021. Retrieved 13 February 2021.
  12. "cov-lineages/pangolin". GitHub: cov-lineages/pangolin. Archived from the original on 15 February 2021. Retrieved 13 February 2021.
  13. "pangoLEARN PANGOLIN 2.0: pangoLEARN description". cov-lineages.org. Archived from the original on 4 November 2021. Retrieved 19 November 2021. The model was trained using ~60,000 SARS-CoV-2 sequences from GISAID... training this model takes approximately 30 minutes on our hardware
  14. "Pangolin COVID-19 Lineage Assigner". pangolin.cog-uk.io. Archived from the original on 10 February 2021. Retrieved 13 February 2021.