Abstract Title
Screening Norovirus Genetic Diversity in Public Datasets using Aitne, a Novel Open Source Genotyper
Presenter
Dennis Schmitz, Erasmus Medical Center / Netherlands Institute of Public Health and the Environment (RIVM)
Co-Author(s)
Dennis Schmitz1,2, Remie Janssen3, Annelies De Rooij1, Annelies Kroneman1, Stefan T. van der Krieken2, Harry Vennema1, Karim Hajji1, Marion P.G. Koopmans2, Jeroen F.J. Laros3,4, Miranda De Graaf2
1National Institute of Public Health and the Environment, Center for Infectious Disease Control, 3720BA Bilthoven, The Netherlands
2Erasmus University Medical Center, Viroscience, 3015GB Rotterdam, The Netherlands
3National Institute of Public Health and the Environment, Department of Bio-Informatics and Computational Services, 3720BA Bilthoven, The Netherlands
4Leiden University Medical Center, Department of Human Genetics, 2333ZA, Leiden, The Netherlands
Abstract Category
Molecular Epidemiology & Evolution
Abstract
Norovirus surveillance relies on genetic comparison of outbreak strains to a curated set of reference strains. Separate typing of the polymerase and capsid genes enables the detection and investigation of (novel) recombinants. However, a tool to screen the genetic diversity of large datasets and flag potential novel clades based on genetic distance has been lacking.
We developed Aitne, an open-source norovirus genotyper based on the latest nomenclature. Aitne splits query sequences into ORF1 and ORF2/3 subsequences, orients them in the sense direction, and performs maximum likelihood phylogenetic analysis. Genetic outliers are flagged and grouped based on bootstrap values and branch length for manual curation and evaluation.
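The outlier-flagging step described above can be illustrated with a minimal sketch. It is not Aitne's actual implementation: the `Placement` record, the `flag_outliers` function, the either-or decision rule, and both threshold values are hypothetical, chosen only to show how branch length and bootstrap support could be combined to group sequences for manual curation.

```python
from dataclasses import dataclass


@dataclass
class Placement:
    """Hypothetical summary of one query's position in the phylogeny."""
    query_id: str
    nearest_genotype: str
    branch_length: float  # substitutions per site to the nearest reference clade
    bootstrap: float      # support (0-100) for clustering with that clade


def flag_outliers(placements, max_branch_length=0.15, min_bootstrap=70.0):
    """Group queries that need manual curation by their nearest genotype.

    A query is flagged when it sits on a long branch or clusters with
    weak support. Both thresholds are illustrative assumptions.
    """
    flagged = {}
    for p in placements:
        if p.branch_length > max_branch_length or p.bootstrap < min_bootstrap:
            flagged.setdefault(p.nearest_genotype, []).append(p.query_id)
    return flagged
```

In this sketch a well-supported query on a short branch is typed directly, while long-branch or weakly supported queries are collected per genotype so that a curator can review each candidate clade as a group.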
Against a reference dataset and a random validation dataset, Aitne achieved 98.8-100% accuracy, including the correct assignment of cross-genogroup recombinants such as GIV[GVI]. In under 46 hours, all 56,145 norovirus sequences available in GenBank were analyzed on a consumer-grade laptop. Although most sequences were partial, both the ORF1 and ORF2 genotypes could be determined for 28.0% of strains. A total of 94 outlier clades were flagged, leading to the tentative identification of a new genogroup, a novel GX genotype, and clades within several genotypes that may represent underrecognized or emerging lineages. When applied to the novel GII.4 San Francisco variant, several new strains were identified, with the earliest from 2015 in central Africa.
Outlier-based investigation uncovered underrecognized diversity at the genogroup, genotype and variant levels, in some cases independently confirming previous studies. Aitne can easily be incorporated into local workflows, and its framework is adaptable to other viruses.