Genomic characteristics and clinical effect of the emergent SARS-CoV-2 B.1.1.7 lineage in London, UK: a whole-genome sequencing and hospital-based cohort study
Methods
Study design and setting
Viral genomes were sequenced from combined nose and throat swab samples collected from Nov 9, 2020, onwards from patients with SARS-CoV-2 infection who were acutely admitted, for any clinical reason, to a ward at either University College London Hospitals (UCLH) or North Middlesex University Hospital (NMUH) on or before Dec 20, 2020. The study dates were selected because the first hospitalised patient with the B.1.1.7 variant was admitted on Nov 9, 2020, and the B.1.1.7 variant became dominant in both hospitals by Dec 20, a date that coincided with a surge in hospitalisations that stretched the capacity of the health services. All hospitalised patients with a positive PCR test during this time period were eligible and included in the study.
Concerns have been raised around the emergence of variants of concern in long-shedding, immunocompromised or treated patients, especially when treatment modalities and prophylaxis target the spike protein (eg, convalescent plasma, monoclonal antibodies, and vaccination). Therefore, as part of the virological dataset, two pre-existing UCLH cohorts were analysed separately to investigate the prevalence of B.1.1.7 variant of concern (VOC)-defining mutations: 123 samples from a longitudinal study of 34 long-shedding patients, including immunocompromised patients who had remained PCR positive for more than 21 days and up to 196 days (median 33 days [IQR 26–64]), and 64 samples from a remdesivir-treated cohort of 32 patients (32 samples obtained before and 32 samples obtained after day 1 of treatment; samples were obtained a median of 5 days [IQR 3–10] before treatment and 13 days [6–19] after treatment).
To explore differences in the clinical severity associated with the B.1.1.7 and other lineages, we did a cohort study across our two centres. Inclusion criteria for this hospitalised cohort were individuals aged at least 18 years whose first PCR-positive SARS-CoV-2 result date and admission date met study criteria.
The clinical information and SARS-CoV-2 PCR samples were collected as part of routine clinical care. Data were extracted and analysed using permission granted by the National Health Service London Westminster Research Ethics Committee (IRAS 284088; REC 20/HRA/2505).
Viral detection for SARS-CoV-2
An array of SARS-CoV-2 RNA assays (Hologic Aptima TMA assay run on a Panther system [Hologic, San Diego, CA, USA], a laboratory-developed PCR run using the open access functionality of the Panther Fusion System [Hologic], a laboratory-developed extraction-free PCR assay, and the Cepheid Xpert Xpress [Cepheid, Sunnyvale, CA, USA]) were used in the diagnostic laboratory. These included non-PCR assays such as the transcription-mediated amplification assay, which does not report Ct values (ie, does not allow inference on quantitation). Therefore, as Ct values were not always available, samples for sequencing were not pre-selected according to Ct.
Next-generation sequencing (NGS) and genomic data analysis
using the V3 version of the ARTIC primer set from Integrated DNA Technologies (Coralville, IA, USA) to create tiled amplicons across the SARS-CoV-2 genome. Libraries were prepared using Nextera Flex and sequenced using Illumina MiSeq 500v2 kits (Nextera DNA Flex library preparation kit and MiSeq reagent cartridge V2 [Illumina, San Diego, CA, USA]).
and aligned to a selection of publicly available SARS-CoV-2 genomes using Mafft.
A read depth cutoff of ten was applied after assembly; genomes with less than 75% alignment coverage were removed from subsequent analysis. Phylogenetic trees were generated from multiple sequence alignments using IQ-Tree and FigTree, with lineages assigned (including VOC calls) using pangolin and confirmed by manual inspection of alignments using Aliview.
The COG-UK Mutation Explorer was used to identify potential mutations of concern.
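The depth and coverage filters described above can be illustrated with a minimal sketch. It assumes that positions with read depth below ten are treated as uncovered and that alignment coverage is the fraction of reference positions that remain covered; the exact definition used in the authors' pipeline is not stated, and the function names and toy depth values are hypothetical.

```python
from typing import Sequence


def covered_fraction(depths: Sequence[int], min_depth: int = 10) -> float:
    """Fraction of positions whose read depth is at least min_depth.

    Assumption: a position below the depth cutoff counts as uncovered.
    """
    if not depths:
        return 0.0
    return sum(d >= min_depth for d in depths) / len(depths)


def passes_qc(depths: Sequence[int], min_depth: int = 10,
              min_coverage: float = 0.75) -> bool:
    """Keep a genome only if at least min_coverage of positions are covered."""
    return covered_fraction(depths, min_depth) >= min_coverage


# Toy per-position depths (illustrative only, not study data).
toy_low = [0, 5, 12, 30, 45, 9, 100, 250, 11, 10]   # 7 of 10 positions covered
toy_high = [12] * 8 + [0] * 2                        # 8 of 10 positions covered

print(passes_qc(toy_low))   # below the 75% threshold, so excluded
print(passes_qc(toy_high))  # above the 75% threshold, so retained
```

In practice the per-position depths would come from the assembled read alignment rather than a hand-written list.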
Association with clinical severity
A minimal common outcome measure set for COVID-19 clinical research.
The scale provides a measure of illness severity from 0 (not infected) to 10 (dead). The highest value of the WHO ordinal scale that was reached by day 14 after symptom onset or after first positive SARS-CoV-2 PCR if asymptomatic was recorded. Severe disease was defined as that which requires positive pressure respiratory support, thereby reaching point 6 or higher on the WHO ordinal scale. Additionally, in-hospital mortality data by day 28 after the first positive test were collected. Clinical outcome was defined as severe if the score on the WHO scale by day 14 after symptom onset or after first positive SARS-CoV-2 PCR was at least 6 or the patient was known to have died within 28 days. Clinical outcome was defined as non-severe if the score on the WHO scale by day 14 was less than 6 and with no in-hospital death by day 28. Treatment escalation plans are a form of advanced directive used in the UK to communicate a ceiling of care around organ support treatments. Documentation of a treatment escalation plan is recommended but not mandatory for all acute hospital admissions in the UK. Because of the effect a treatment escalation plan might have on the maximum degree of organ support received and, therefore, maximum ordinal scale point reached, documentation of a valid treatment escalation plan and the relevant limitation on ordinal scale progression were recorded.
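The outcome definition above reduces to a simple rule, sketched here for clarity. The function and argument names are hypothetical; only the thresholds (WHO ordinal score of at least 6 by day 14, or in-hospital death by day 28) come from the text.

```python
SEVERE_THRESHOLD = 6  # WHO ordinal scale point for positive pressure support


def classify_outcome(max_who_score_day14: int,
                     died_in_hospital_by_day28: bool) -> str:
    """Return 'severe' or 'non-severe' following the stated definition."""
    if not 0 <= max_who_score_day14 <= 10:
        raise ValueError("WHO ordinal scale runs from 0 to 10")
    if max_who_score_day14 >= SEVERE_THRESHOLD or died_in_hospital_by_day28:
        return "severe"
    return "non-severe"


print(classify_outcome(5, False))  # oxygen without positive pressure, survived
print(classify_outcome(6, False))  # reached positive pressure support
print(classify_outcome(4, True))   # low maximum score by day 14 but died by day 28
```

A treatment escalation plan that caps progression below point 6 would need to be handled separately, as in the sensitivity analyses.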
We used standard definitions
of community-acquired infection (symptoms or positive swab up to 2 days after admission), possible hospital-acquired infection (3–7 days after admission), probable hospital-acquired infection (8–14 days after admission) and definite hospital-acquired infection (≥15 days after admission). Admitted individuals were unlikely to have been vaccinated against COVID-19 because this study pre-dates the onset of the UK vaccination programme (appendix p 1).
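The acquisition categories above map directly onto the interval between admission and the first symptoms or positive swab. The sketch below encodes those day ranges; the function name is hypothetical, and negative values (onset before admission) are assumed to count as community acquired.

```python
def classify_acquisition(days_after_admission: int) -> str:
    """Classify infection by days from admission to symptoms or positive swab."""
    if days_after_admission <= 2:
        return "community-acquired"
    if days_after_admission <= 7:
        return "possible hospital-acquired"
    if days_after_admission <= 14:
        return "probable hospital-acquired"
    return "definite hospital-acquired"


for day in (-3, 2, 3, 8, 15):
    print(day, classify_acquisition(day))
```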
Data were collected locally using a hospital laboratory information management system and electronic health record system and combined in pseudo-anonymised form for analysis. Logic, range, and missing data checks were done by the authors and queries verified against clinical records before analyses.
Statistical analysis
Coronavirus (COVID-19) infection survey, UK.
(ONS), and comparisons were made with UCLH and NMUH data using standard linear regression.
Univariable comparisons of categorical variables were done using χ2 or Fisher’s exact tests, or χ2 test for trend, and continuous variables were compared using the Wilcoxon-Mann-Whitney rank-sum test. Adjusted prevalence ratios (PRs) were estimated by fitting Poisson regression models with robust estimates to investigate associations between SARS-CoV-2 variant (B.1.1.7 vs non-B.1.1.7) and the outcome of severe disease or death, adjusting for potential confounders (hospital, age, sex, ethnicity, and comorbidity score). Wald tests were used to assess associations between the outcome and interaction terms between variant and hospital, age, and sex. Sensitivity analyses were done, first, limited to those without a treatment escalation plan documented, or whose treatment escalation plan was at WHO level 6 or higher; second, among those with symptoms or a positive test pre-dating hospital admission; and third, with inclusion of WHO level 5 (oxygen without positive pressure) in the outcome group.
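To make the prevalence ratio concrete, the sketch below computes a crude (unadjusted) PR with a Wald confidence interval on the log scale. This is a simplification and not the adjusted Poisson model with robust estimates described above, although for a single binary exposure the point estimate of such a model coincides with this ratio of proportions. The function name and the counts are hypothetical and are not study data.

```python
import math
from typing import Tuple


def crude_prevalence_ratio(events_exposed: int, n_exposed: int,
                           events_unexposed: int, n_unexposed: int,
                           z: float = 1.96) -> Tuple[float, float, float]:
    """Crude PR and Wald 95% CI (log method) from a 2x2 table."""
    p1 = events_exposed / n_exposed
    p0 = events_unexposed / n_unexposed
    pr = p1 / p0
    # Standard error of log(PR) for two independent binomial proportions.
    se = math.sqrt(1 / events_exposed - 1 / n_exposed
                   + 1 / events_unexposed - 1 / n_unexposed)
    lower = math.exp(math.log(pr) - z * se)
    upper = math.exp(math.log(pr) + z * se)
    return pr, lower, upper


# Hypothetical counts: 40 of 100 severe in each group.
pr, lower, upper = crude_prevalence_ratio(40, 100, 40, 100)
print(f"PR {pr:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

Adjustment for hospital, age, sex, ethnicity, and comorbidity score requires a regression model and cannot be reproduced from a single 2x2 table.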
Role of the funding source
The funder of the study had no role in the study design, data collection, data analysis, data interpretation, or writing of the report.
Results
Table 1: Cohort characteristics by SARS-CoV-2 lineage
Data are n/N (%), unless otherwise indicated. p values were calculated using χ2 or Fisher’s exact tests, or χ2 test for ordinal variables. BMI=body-mass index. NMUH=North Middlesex University Hospital. UCLH=University College London Hospitals.
Overall, 92 (27%) of 339 patients had no identified treatment escalation plan in place, 221 (65%) had a treatment escalation plan with a specific maximum level, and 26 (8%) were missing information on the presence of a treatment escalation plan. 38 (17%) of 221 patients with a treatment escalation plan had restrictions limiting progression beyond ordinal scale level 5, of whom 24 (63%) of 38 had died.
Table 2: Association of SARS-CoV-2 B.1.1.7 variant with disease severity
From available sample Ct value data for 27 B.1.1.7 samples and 38 non-B.1.1.7 samples, we found significantly lower Ct values associated with B.1.1.7 compared with non-B.1.1.7 (mean Ct 28·8 [SD 4·7] vs 32·0 [4·8]; p=0·0085). Correspondingly, we found significantly higher median genomic read depths in B.1.1.7 samples than in non-B.1.1.7 samples (mean median depths 1445 [952] vs 782 [728]; p=0·0030).
with corresponding reductions in non-B.1.1.7 sequences (linear regression, r2=0·90; p=0·0038) and PCR tests positive for ORF1ab, N, and S genes over time (r2=0·88; p=0·0054; figure 2).
Investigation of novel SARS-CoV-2 variant of concern 202012/01.
(first isolated Sept 20, 2020, in Kent, UK; GISAID ID EPI_ISL_601443) with regards to B.1.1.7 VOC-defining single nucleotide polymorphisms (SNPs) and deletions, with the exception of samples where sequencing had failed for the region of interest. Of 13 non-synonymous SNPs, key mutations included N501Y (A23063T, Asn501Tyr) and P681H (C23604A, Pro681His) in the spike protein, a co-occurrence not previously observed. Asn501Tyr is a key contact residue in the receptor-binding domain, which has been shown to enhance angiotensin-converting enzyme 2 receptor affinity.
P681H forms part of a quartet of residues involved in creating a furin cleavage site between S1 and S2, promoting entry into lung cells and primary human airway epithelial cultures.
All B.1.1.7 genomes contained a deletion at S 69–70, which causes reproducible S-gene target failure in the TaqPath assay and in conjunction with N501Y might account for increased transmissibility of the variant.
Non-spike B.1.1.7 VOC-defining mutations included SNPs in N, ORF1ab, and ORF8 (including a premature stop codon at position 27) and six synonymous mutations observed across all samples.
Further sequence analysis confirmed this observation. Nucleotide diversity within all UCLH and NMUH VOC samples was approximately 4·5 nt, with 5·2 nt for UCLH and 3·9 nt for NMUH variants as individual sets, most having no clear epidemiological linkage. Similar analysis of B.1.177 lineage samples at UCLH during the same time period estimated their nucleotide diversity to be approximately two to three times higher (11·8 nt; 64 sequences). 156 (85%) of 182 B.1.1.7 samples linked by pairwise comparison (distance ≤2 nt) to another sample at the same hospital could also be linked to another sample at the other hospital.
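The diversity and linkage measures above can be sketched as follows, assuming that the reported diversity (in nt) is the mean number of pairwise nucleotide differences between aligned genomes and that ambiguous or missing positions are ignored; the paper does not state the exact method. The function names and toy sequences are hypothetical.

```python
from itertools import combinations
from typing import Dict, Sequence, Set

UNAMBIGUOUS = set("ACGT")


def pairwise_distance(a: str, b: str) -> int:
    """Count differing positions, skipping gaps and ambiguous bases."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    return sum(
        1
        for x, y in zip(a.upper(), b.upper())
        if x in UNAMBIGUOUS and y in UNAMBIGUOUS and x != y
    )


def mean_pairwise_distance(seqs: Sequence[str]) -> float:
    """Mean number of nucleotide differences over all sequence pairs."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(pairwise_distance(a, b) for a, b in pairs) / len(pairs)


def linked_samples(seqs: Dict[str, str], max_distance: int = 2) -> Set[str]:
    """Names of samples within max_distance nt of at least one other sample."""
    linked: Set[str] = set()
    for (name_a, seq_a), (name_b, seq_b) in combinations(seqs.items(), 2):
        if pairwise_distance(seq_a, seq_b) <= max_distance:
            linked.update((name_a, name_b))
    return linked


# Toy alignment (illustrative only, not study data).
toy = {
    "s1": "ACGTACGTAC",
    "s2": "ACGTACGTAT",
    "s3": "TCGAACCTNC",
}
print(mean_pairwise_distance(list(toy.values())))
print(sorted(linked_samples(toy)))
```

On the figures reported above, 11·8 nt for B.1.177 against approximately 4·5 nt for the VOC samples gives a ratio of about 2·6, in line with the stated two to three times higher diversity.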
but classified by COG-UK Mutation Explorer as there being low confidence in their effect. However, UK B.1.1.7 lineages have shown a rapid rise in acquisition of Leu18Phe,
and it is a conserved VOC-defining mutation for the P.1 VOC lineage first isolated in Brazil.
In this study, Leu18Phe was found in seven samples from three treated patients and one protracted viral shedder. In each case, the Leu18Phe mutations were present in the first successfully sequenced sample for each patient and were observed in all subsequent samples. We found no evidence that the mutations arose over the course of infection or in response to treatment.
Discussion
The emergence of novel VOCs in the ongoing SARS-CoV-2 pandemic requires rapid genomic, virological, epidemiological, and clinical characterisation to inform public health, clinical, and research responses. This study was done contemporaneously with the emergence and spread of the B.1.1.7 variant throughout the south of England and offers a unique and well characterised cohort of hospitalised patients. Within this cohort, which represents a substantial proportion of the hospitalised patients with COVID-19 in north-central London during this period, we found no evidence that the B.1.1.7 variant was associated with severe disease or death. One of the strengths of this study lies in its timing, which was several weeks before the peak of hospital admissions in London, and before any substantial resource limitation or strain on clinical care.
In the COVID-19 pandemic, two advances have facilitated this surveillance: the wider use of deep sequencing techniques and the availability of advanced bioinformatic tools and digital platforms giving immediate access for near real-time analysis.
Of particular concern are mutations relating to cross-species transmission, in the case of SARS-CoV-2 allowing for potential establishment of new animal reservoirs.
International travel adds further complexity because population movement offers opportunities for variants to transmit worldwide. Variants must be rapidly assessed for their potential to increase transmission, to result in resistance to antiviral treatments and vaccines, and to alter the clinical phenotype, disease severity, and mortality.
we investigated whether this characteristic is reflected by an increase in viral load, using Ct values from an in-house N-gene real-time RT-PCR assay and genomic read depths as surrogates. Although our Ct value analysis was limited by data availability, other studies have shown that NGS read counts can be used as a reliable predictor of viral load.
Given that we found significant differences for Ct values and genomic read depths between B.1.1.7 and non-B.1.1.7 samples, we believe that B.1.1.7 infections were associated with higher viral loads than were non-B.1.1.7 infections in this study. This finding is in keeping with results from similar independent analyses, including that of approximately 1400 genomes assembled as part of the UK test and trace programme, which reported a 0·5 increase in median log10-inferred viral load in B.1.1.7 relative to non-B.1.1.7 samples.
Our observed higher read depths are equivalent to a 0·2–0·3 increase in log10 read depth in B.1.1.7 relative to non-B.1.1.7, a smaller increase than observed in the previous study, which might be a consequence of sampling patients at later stages of infection than was done for test and trace swabs, which are typically derived from recently symptomatic individuals when viral loads are likely to be high.
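As a worked check of the log10 difference, the sketch below takes the mean median read depths reported in the Results (1445 for B.1.1.7 and 782 for non-B.1.1.7) and computes the difference of their logarithms. This assumes the comparison is made on the group means; the paper does not state how the 0·2–0·3 range was derived, and the variable names are hypothetical.

```python
import math

# Mean median read depths reported in the Results.
depth_b117 = 1445
depth_non_b117 = 782

log10_difference = math.log10(depth_b117) - math.log10(depth_non_b117)
print(round(log10_difference, 2))
```

Under this assumption the difference falls inside the stated 0·2–0·3 range and is smaller than the 0·5 increase in log10-inferred viral load reported for the test and trace genomes.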
In our study, a greater proportion of the sequenced group had severe disease than of the unsequenced group (who were unsequenced as a result of having insufficient sample collected or failed sequencing). An underlying association between B.1.1.7 and disease severity in the hospitalised population overall might have been unobserved in our analyses, if those with B.1.1.7 were more likely to have a successful sequence because of a higher viral load or a greater number of samples available. However, we did not detect a trend in the proportion of sequences failing over time, as B.1.1.7 began to predominate in the population, or any correlation between proportion of B.1.1.7 samples among those sequenced and proportion of unsequenced samples in each week of the study.
consistent with other UK data and ours. The accumulation of 17 mutations suggests possible emergence of the lineage in an immunocompromised host, and although our data from immunocompromised and remdesivir-treated patients do not confirm this hypothesis, these populations will need to be monitored intensely as they receive vaccines, monoclonal antibodies, and other preventive and treatment modalities in the near future. However, our findings suggest that B.1.1.7 VOC-defining mutations do not arise solely in response to remdesivir treatment and are not more likely in immunocompromised patients in the absence of additional treatment. The L18F mutations we observed in several patients most likely reflect a higher prevalence of the mutation at the time of sampling before B.1.1.7 became the dominant lineage in the UK. The canonical B.1.1.7 VOC does not contain this mutation. Nevertheless, variants identified in South Africa and Brazil pose further concern, especially because they carry mutations with the potential to escape antibodies or vaccines and have been emerging in populations with presumed high seroprevalence. Indeed, an analysis of our data identified two B.1.1.7 isolates with the Glu484Lys (E484K) substitution (both otherwise identical to the canonical B.1.1.7 VOC reference genome), causing concern that the VOC is acquiring this mutation while circulating in the UK and might further spread.
The lower observed diversity within the SARS-CoV-2 sequenced genomes included in this study is consistent with B.1.1.7 transmission occurring more readily during early infection. A broader phylodynamic analysis over a longer timeframe, accounting for sampling bias (eg, local outbreaks), would confirm whether the underlying rate of nucleotide substitution is genuinely lower for B.1.1.7 or, more likely, whether the lower diversity simply reflects a more recent common ancestor.
B.1.1.7 was seen more frequently in one of the two hospitals and significantly more frequently in ethnic minority groups than in White people. This finding could be explained by a difference in demographics and socioeconomic factors between patients in the two hospitals, suggesting a founder effect in this population at the time of B.1.1.7 VOC emergence.
Update note on B.1.1.7. severity, 11 February.
In our study, older age remained associated with severe outcome or death in adjusted analyses, although no difference between lineages was reported. Further community-based studies should be done to allow a larger denominator unselected by disease severity, to investigate any association between B.1.1.7 and the probability of hospitalisation or small differences in virulence that might occur in individuals with pauci-symptomatic or asymptomatic infection. This association might be of particular relevance when investigating effects potentially confounded by age because minimally symptomatic infections occur more frequently in younger individuals than in older individuals. This study was able to rule out a difference of 1·85 or greater increased odds of severe disease. More subtle associations with severity have been reported
but in different types of community cohorts that do not allow for direct comparison.
was captured within 14 days after a positive test or onset of symptoms in this study, allowing sufficient time for deterioration, given the median time to clinical deterioration in a large observational study was 4 days (IQR 1–9) after admission.
Some patients might have deteriorated after day 14 and the outcome missed, but this possibility was mitigated by capturing death at day 28 for hospitalised patients. Some individuals might have been discharged and died either at home or at another site, and their outcome not captured. Finding B.1.1.7 more commonly in younger versus older individuals gives a subtle hint of more severe disease if patients with B.1.1.7 are hospitalised more often compared with patients with other lineages, although a difference in disease severity by B.1.1.7 was not found in this hospitalised cohort in the main analysis. In sensitivity analyses further exploring the non-severe group, compared with those with non-B.1.1.7, those with B.1.1.7 were more likely to receive oxygen without positive pressure, and this difference persisted in adjusted analyses. However, we are cautious in the interpretation of this finding because of the limitations of oxygen without positive pressure as a measure of disease severity (with its use possibly being selected by reasons for hospitalisation unrelated to COVID-19 or residual confounding by other patient characteristics). Further, we found no clear pattern towards more severe disease in the other ordinal scale levels in the B.1.1.7 group. We acknowledge that comparisons of outcomes between groups have not been corrected for treatments, including use of steroids, antiviral medications, tocilizumab, and convalescent plasma. Also, some patients possibly met our outcome definition by receiving oxygen with positive pressure or ventilation for reasons other than COVID-19.
Rapid collection of good quality clinical data with the appropriate granularity, in combination with whole-genome sequencing of SARS-CoV-2, is imperative in deciding whether variants are associated with altered clinical outcomes. These data, in conjunction with in-vitro investigation of neutralisation capacity of sera from individuals following vaccination and natural infection, are essential in the public health response and clinical management of COVID-19. Large readily available datasets will be key in enabling rapid clinical assessment of variants. Our data, within the context and limitations of a real-world study, provide initial reassurance that severity in hospitalised patients with B.1.1.7 is not markedly different from severity in those without, and this study provides a model to answer the same question again as we move into an era of emerging variants.
EN, DF, CFH, TR, AC, and HB drafted the first version of the manuscript. JH did all sample extractions and sequencing. TR, AC, RScot, JP, and CFH did the clinical data extraction, verification, and curation. DF and MBy designed and did the bioinformatic analyses. CFH, HB, RScon, and EN designed and did the cohort study analysis. All authors provided data or contributed to the writing of the manuscript and approved the final version. All authors had full access to all the data in the study and had final responsibility for the decision to submit for publication.