Protein Structure and Sequence Reanalysis of 2019-nCoV Genome Refutes Snakes as Its Intermediate Host and the Unique Similarity between Its Spike Protein Insertions and HIV-1.
As the infection of 2019-nCoV coronavirus is quickly developing into a global pneumonia epidemic, the careful analysis of its transmission and cellular mechanisms is sorely needed. In this Communication, we first analyzed two recent studies that concluded that snakes are the intermediate hosts of 2019-nCoV and that the 2019-nCoV spike protein insertions share a unique similarity to HIV-1. However, the reimplementation of the analyses, built on larger scale data sets using state-of-the-art bioinformatics methods and databases, presents clear evidence that rebuts these conclusions. Next, using metagenomic samples from Manis javanica, we assembled a draft genome of the 2019-nCoV-like coronavirus, which shows 73% coverage and 91% sequence identity to the 2019-nCoV genome. In particular, the alignments of the spike surface glycoprotein receptor binding domain revealed four times more variations in the bat coronavirus RaTG13 than in the Manis coronavirus compared with 2019-nCoV, suggesting the pangolin as a missing link in the transmission of 2019-nCoV from bats to human.