COVID lab-leak theory: ‘Rare’ genetic sequence doesn’t mean the virus was engineered

By Keith Grehan and Natalie Kingston

June 25, 2021

The theory that the COVID-19 pandemic was triggered by the Sars-CoV-2 virus being leaked from the Wuhan Institute of Virology in China was recently given new life following an explosive article in the Wall Street Journal (WSJ) in which the authors claimed “the most compelling reason to favor the lab leak hypothesis is firmly based in science.” But does the science really support the claim that the virus was engineered in a laboratory?

Understanding the origin of a viral outbreak can provide scientists with important information about viral lineages and allow steps to be put in place to avoid similar outbreaks in the future. As such, the origin of Sars-CoV-2 has been debated from the beginning of the pandemic and remains an active topic of discussion among scientists.

It has long been known that viruses similar to the original Sars-CoV that causes Sars are found in bats. These viruses are well studied in China, where the 2002 Sars outbreak originated. But related viruses have been found globally.

Unsurprisingly, coronaviruses are again involved in a pandemic, the third such event in the 21st century: first Sars, then Mers, now COVID-19. While a natural origin seems likely, and many have long warned about the danger of viruses circulating in wildlife, scientists shouldn’t jump to conclusions.

An important way scientists can determine the origin of a virus is by looking at its genome. In the WSJ article, the authors, Prof Richard Muller, an astrophysicist, and Dr Steven Quay, physician and chief executive of Atossa Therapeutics, claim Sars-CoV-2 has “genetic fingerprints” of a lab-origin virus. They say that the presence of a particular genetic sequence (CGG-CGG) is a sign that the virus originated in a lab.

To understand the claims being made, we must first understand the genetic code. When a virus infects a cell, it hijacks the cellular machinery, providing instructions (its genome) to produce more copies of itself. This genome comprises a long series of molecules called nucleotides, each of which is represented by one of the letters A, C, G or U.

A group of three nucleotides (known as a codon) instructs a cell to add a particular amino acid, one of the basic molecular building blocks of living things, to a growing protein. Most amino acids are encoded by several different codons. CGG is one of six possible codons that instruct the cell to add the amino acid arginine.
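As a rough illustration of the point above, the short Python sketch below builds the standard genetic code as a lookup table and lists every codon that encodes arginine. The names `CODON_TABLE` and `codons_for` are ours, chosen for illustration, and the sketch assumes the standard genetic code written with RNA letters (U rather than T).

```python
from itertools import product

# Standard genetic code, with codons ordered by base in the sequence U, C, A, G.
# "*" marks a stop codon; amino acids use their one-letter symbols (R = arginine).
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

# Map each of the 64 three-letter codons to the amino acid it encodes.
CODON_TABLE = {
    "".join(triplet): amino_acid
    for triplet, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons_for(amino_acid):
    """Return, in alphabetical order, every codon encoding the given amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


arginine_codons = codons_for("R")
print(arginine_codons)  # ['AGA', 'AGG', 'CGA', 'CGC', 'CGG', 'CGU']
```

CGG appears alongside five other codons, all of which produce exactly the same amino acid.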

The authors of the WSJ article argue that Sars-CoV-2 originated in a lab based on the presence of a “CGG-CGG” sequence. They claim this is a “readily available and convenient” codon pair that scientists prefer to use to produce the amino acid arginine. But as anyone with an understanding of the techniques required for genetic modification will recognise, this double-CGG is usually no harder or easier to produce than any other pair of codons that encodes two arginines.

No reason CGG-CGG had to be made in lab

The authors claim that the CGG codon appears less frequently than the other five possible arginine codons in betacoronaviruses (the genus of coronaviruses to which Sars-CoV-2 belongs). If we look at related coronaviruses, the CGG codon encodes about 5 percent of all arginines in Sars-CoV compared with about 3 percent of all arginines in Sars-CoV-2. Though CGG is less common than other codons, the authors’ argument fails to provide a reason why the double-CGG sequence could not exist naturally.
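To make the idea of codon usage concrete, here is a minimal Python sketch of how such percentages could be calculated: split a protein-coding sequence into codons, keep only those that encode arginine, and work out each codon's share. The function name `arginine_codon_usage` and the short `toy_sequence` are invented for illustration. The sequence is not taken from any virus, and a real analysis would need the annotated reading frames of a full genome.

```python
from collections import Counter

# The six codons that encode arginine in the standard genetic code.
ARGININE_CODONS = {"CGU", "CGC", "CGA", "CGG", "AGA", "AGG"}


def arginine_codon_usage(coding_sequence):
    """Return each arginine codon's share of all arginine codons in the sequence.

    Assumes the sequence starts in the correct reading frame. DNA-style input
    (T) is converted to RNA letters (U); trailing bases that do not fill a
    complete codon are ignored.
    """
    seq = coding_sequence.upper().replace("T", "U")
    usable_length = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable_length, 3)]
    counts = Counter(c for c in codons if c in ARGININE_CODONS)
    total = sum(counts.values())
    if total == 0:
        return {}  # no arginine codons in this sequence
    return {codon: counts[codon] / total for codon in sorted(ARGININE_CODONS)}


# Hypothetical toy sequence: start codon, eight arginine codons, stop codon.
toy_sequence = "".join(["AUG"] + ["AGA"] * 4 + ["CGU"] * 2 + ["AGG", "CGG", "UAA"])

usage = arginine_codon_usage(toy_sequence)
for codon, share in usage.items():
    print(f"{codon}: {share:.1%}")
```

In this made-up example CGG accounts for one of eight arginines (12.5 percent). A low share like this tells us a codon is uncommon, not that it cannot arise naturally.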

The authors argue that recombination (when viruses that infect the same host share genetic material) is the most likely natural route by which Sars-CoV-2 could have obtained the double-CGG sequence. They note that the double-CGG codon pair is not found in other members of this “class” of coronavirus, and conclude that natural recombination could not possibly have generated a double-CGG. However, viruses do not just depend on preassembled segments of genetic material to evolve and expand their host range.

The authors also claim that mutation (random copying errors) is unlikely to generate the double-CGG sequence. But viruses evolve at a rapid rate, so much so that the accumulation of mutations is a common inconvenience of virological studies. Recombination is one way in which viruses evolve, but the authors’ dismissal of mutation as a source of viral change is an inaccurate description of reality.

The final claim, that the first sequenced Sars-CoV-2 virus was ideally suited to the human host, neglects evidence of viral circulation in local animal populations, animal-to-animal transmission, and the rapid evolution that is driving the increasing transmissibility of the newer variants. If the virus were ideally adapted to humans, why is so much further evolution evident?

Disappointingly, many other media articles appear to have accepted and repeated the claims from the WSJ piece. The origin of Sars-CoV-2 may remain unresolved, but there is no evidence presented in the WSJ piece that scientifically supports the concept of a lab leak of a genetically engineered virus.

Keith Grehan is a postdoctoral researcher in molecular biology and Natalie Kingston is a research fellow in virology, both at the University of Leeds.

This article was originally published on The Conversation.