The emergence of high-fitness variants accelerates the slowdown of genome heterogeneity in the coronavirus

  1. Oliver, José L. 1
  2. Galván, Pedro Bernaola 2
  3. Perfectti, Francisco 1
  4. Martín, Cristina Gómez 1
  5. Castiglione, Silvia 3
  6. Raia, Pasquale 3
  7. Verdú, Miguel 4
  8. Moya, Andrés 5
  1. 1 Department of Genetics, Faculty of Sciences, University of Granada, 18071, Granada, Spain
  2. 2 Department of Applied Physics II and Institute Carlos I for Theoretical and Computational Physics, University of Málaga, 29071, Málaga, Spain
  3. 3 Dipartimento di Scienze della Terra, dell'Ambiente e delle Risorse, Università di Napoli Federico II, 80126, Napoli, Italy
  4. 4 Centro de Investigaciones sobre Desertificación, Consejo Superior de Investigaciones Científicas (CSIC), University of València and Generalitat Valenciana, 46113, Valencia, Spain
  5. 5 Institute of Integrative Systems Biology (I2Sysbio), University of València and Consejo Superior de Investigaciones Científicas (CSIC), 46980, Valencia, Spain

Publisher: Zenodo

Publication year: 2022

Type: Dataset

DOI: 10.5281/ZENODO.6655724 (open access)

Abstract

Supplement to the paper “The emergence of high-fitness variants accelerates the slowdown of genome heterogeneity in the coronavirus”.

Since the outbreak of the COVID-19 pandemic, the SARS-CoV-2 coronavirus has accumulated a substantial amount of genomic variability through mutation and recombination. To test for evolutionary trends that could inform us about the adaptive process of the virus to its human host, we compute a genome-wide measure of Sequence Compositional Complexity (SCC) in high-quality coronavirus genomes from across the globe, covering the full span of the pandemic. In early samples, we find no statistical support for any trend in SCC values over time, although the virus genome appears to evolve faster than expected under Brownian motion. However, in samples taken after the emergence of Variants of Concern with higher transmissibility, and controlling for phylogenetic and sampling effects, we detect a declining trend for SCC and an increasing one for its absolute evolutionary rate. This means that the decline in SCC itself accelerated over time, and that the increasing fitness of variant genomes led to a reduction of their genome sequence heterogeneity.

Supplementary files

| File | Description |
| --- | --- |
| SupplementaryTables S1-S18.xlsx | The strain name, the collection date, and the SCC values for each analyzed genome. |
| SupplementaryTableS19.pdf | A complete list acknowledging all originating and submitting laboratories for the sequence data in GISAID EpiCoV on which these analyses are based. |
| SupplementaryTable S20.pdf | A complete list acknowledging the authors, originating and submitting laboratories of the genetic sequences we used for the analysis of the Nextstrain sample. |
| PhylogeneticTimetrees_NexusFormat.zip | Phylogenetic timetrees (Nexus format). |
| PhylogeneticTimetrees_NewickFormat.zip | Phylogenetic timetrees (Newick format). |
| SCCdata.zip | SCC data. |
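To make the idea of a "trend in SCC over time" concrete, the following minimal sketch fits an ordinary least-squares slope of SCC against collection date. It is an illustration only: the function name `ols_slope` and the toy values are hypothetical and not taken from the dataset, and a plain regression does not control for phylogenetic or sampling effects as the analyses described in the abstract do.

```python
from datetime import date


def ols_slope(dates, values):
    """Ordinary least-squares slope of values against time, in units per day."""
    if len(dates) != len(values) or len(dates) < 2:
        raise ValueError("need at least two paired observations")
    # Express each collection date as days elapsed since the earliest sample.
    t0 = min(dates)
    xs = [(d - t0).days for d in dates]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(values) / len(values)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all samples share the same collection date")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    return sxy / sxx


# Made-up placeholder values for illustration; NOT real SCC measurements.
toy_dates = [date(2021, 1, 1), date(2021, 1, 11), date(2021, 1, 21)]
toy_scc = [3.0, 2.0, 1.0]

slope = ols_slope(toy_dates, toy_scc)
print(f"slope per day: {slope:.3f}")
```

A negative slope corresponds to SCC declining over time. Applying this to the actual tables would require reading the collection dates and SCC values from the supplementary files, whose internal column layout is not described in this record.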