Using Biobank Data to Estimate Genetic Variants Associated with Mortality in COVID-19 Patients


Coronavirus disease 2019 (COVID-19) is a highly infectious disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) [1]. The severity and course of the infection are highly heterogeneous. Most people with COVID-19 have mild or no symptoms, while some patients develop serious health outcomes. A prospective observational cohort study in the UK showed that patients with existing conditions such as diabetes, cardiovascular disease, hypertension, or chronic respiratory disease were at higher risk of death [2]. Other studies have shown that some ethnic groups have an increased risk of death from COVID-19 [3]. These observations suggest that there might be genetic determinants that predispose a subgroup of patients to more severe COVID-19 outcomes. Therefore, there is a need to understand the genetic basis of heterogeneous susceptibility to COVID-19.

The UK Biobank has released COVID-19 test results for 12,428 participants, including 1,778 (14.31%) infected cases and 445 deaths related to COVID-19 [4]. This dataset is accompanied by already available health care, genetic, and mortality data, and offers a unique resource for assessing the genetic determinants of COVID-19 susceptibility, severity, and mortality. In this study, Hu and colleagues performed a genome-wide association study (GWAS) that uses super-variants, a concept from statistical genetics, to identify potential risk loci contributing to COVID-19 mortality. A super-variant is a combination of alleles at multiple loci, analogous to a gene. The rationale for this analysis was twofold. First, COVID-19 infection requires environmental exposure, so the genetic contribution to infection may be limited relative to exposure, while mortality may have a stronger genetic component. Second, COVID-19 is a complex syndrome that may reflect interacting genomic factors, and analysis with super-variants makes it possible to leverage interactions among multiple genes [5].
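To make the idea of a super-variant more concrete, the following is a minimal sketch, not the authors' pipeline. It assumes that a super-variant can be coded as a binary indicator equal to 1 when a participant carries at least one minor allele at any locus in a chosen set, and it then compares mortality between carriers and non-carriers with a continuity-corrected odds ratio. The function names, the choice of loci, and the toy genotypes are all hypothetical and serve only as illustration.

```python
from typing import List, Sequence


def super_variant(genotypes: Sequence[Sequence[int]], loci: Sequence[int]) -> List[int]:
    """Collapse several loci into one binary indicator per participant.

    genotypes[i][j] is the minor-allele count (0, 1, or 2) of participant i
    at locus j. The indicator is 1 if any selected locus carries a minor
    allele (an assumed coding, chosen here for simplicity).
    """
    return [int(any(row[j] > 0 for j in loci)) for row in genotypes]


def odds_ratio(exposure: Sequence[int], outcome: Sequence[int]) -> float:
    """Odds ratio from a 2x2 table, adding 0.5 to each cell
    (Haldane-Anscombe correction) so empty cells do not cause division by zero."""
    a = b = c = d = 0
    for e, y in zip(exposure, outcome):
        if e and y:
            a += 1  # carrier, died
        elif e:
            b += 1  # carrier, survived
        elif y:
            c += 1  # non-carrier, died
        else:
            d += 1  # non-carrier, survived
    return ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5))


# Toy data (fabricated for illustration only): 6 participants, 3 loci.
genotypes = [
    [0, 1, 0],
    [0, 0, 0],
    [2, 0, 0],
    [0, 0, 1],
    [0, 0, 0],
    [0, 0, 0],
]
died = [1, 0, 1, 0, 0, 0]
loci = [0, 1]  # hypothetical set of loci forming one super-variant

sv = super_variant(genotypes, loci)
print("super-variant indicator:", sv)
print("odds ratio for mortality:", odds_ratio(sv, died))
```

In a real analysis the set of loci would be selected from the data and the association tested with adjustment for covariates such as age and sex; the sketch leaves those steps out.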

After the analysis, the identified super-variants were mapped to annotated genes. The most interesting signal came from a super-variant associated with the gene DNAH7. This gene encodes dynein axonemal heavy chain 7, a component of the inner dynein arm of ciliary axonemes. A recently published paper showed that DNAH7 is the most downregulated gene after infection of human bronchial epithelial cells with SARS-CoV-2. The authors speculated that downregulation of DNAH7 reduces the function of respiratory cilia. These results suggest that COVID-19 patients with variations in DNAH7 have a higher risk of dying from COVID-19. An existing hypothesis is that disruption of DNAH7 function may result in ciliary dysmotility and weakened mucociliary clearance, which leads to severe respiratory failure, a likely cause of COVID-19 death. Other detected super-variants were associated with genes such as SLC39A10 and CLUAP1, which are involved in immune cell homeostasis and ciliary movement, respectively. These findings underscore the importance of properly functioning respiratory cilia in COVID-19 patients; the cilia may be an important site of host-pathogen interaction during SARS-CoV-2 infection of the airways as well as a potential therapeutic target.

According to the authors, “We identify 8 potential genetic risk loci for the mortality of COVID-19. These findings may provide timely evidence and clues for better understanding the molecular pathogenesis of COVID-19 and genetic basis of heterogeneous susceptibility, with potential impact on new therapeutic options.”




  1. Zhu N., et al., A novel coronavirus from patients with pneumonia in China, 2019. New England Journal of Medicine, 2020.
  2. Docherty A.B., et al., Features of 20133 UK patients in hospital with covid-19 using the ISARIC WHO Clinical Characterisation Protocol: prospective observational cohort study. BMJ, 2020. 369.
  3. Stoian A.P., et al., Gender differences in the battle against COVID-19: impact of genetics, comorbidities, inflammation and lifestyle on differences in outcomes. International Journal of Clinical Practice, 2020: p. e13666.
  4. Armstrong J., et al., Dynamic linkage of COVID-19 test results between Public Health England’s Second Generation Surveillance System and UK Biobank. Microbial Genomics, 2020.
  5. Hu J., Li C., Wang S., Li T., Zhang H., Genetic variants are identified to increase risk of COVID-19 related mortality from UK Biobank data. medRxiv (preprint), 2020. doi:10.1101/2020.11.05.20226761