Published online by Cambridge University Press: 03 December 2021
The COVID-19 pandemic led to a significant surge in clinical research activities in the search for effective and safe treatments. As researchers sought to disseminate early findings from clinical trials and so accelerate patient access to promising treatments, the use of preprint repositories rose. In the UK, the NIHR Innovation Observatory (NIHRIO) provided primary horizon-scanning intelligence on global trials to a multi-agency initiative on COVID-19 therapeutics. This intelligence included signals from preliminary results to support the selection and prioritisation of, and access to, promising medicines.
A semi-automated text mining tool written in Python 3 used the trial IDs (identifiers) of ongoing and completed studies, selected from major clinical trial registries according to pre-determined criteria, as search terms. Two sources, BioRxiv and MedRxiv, were searched using these IDs. Each week, the tool automatically searched, de-duplicated, excluded reviews, and extracted the title, authors, publication date, URL and DOI. The output was verified by two reviewers, who manually screened and excluded studies that did not report results.
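The automated steps described above (search by trial ID, de-duplicate, exclude reviews, extract metadata) can be illustrated with a minimal Python sketch. This is not the NIHRIO tool itself: the `Preprint` record, the `preselect` function, the title-based review filter and the use of the DOI as the de-duplication key are all assumptions made for illustration, and the sketch works on records already retrieved rather than querying either repository.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Preprint:
    """Hypothetical record for one retrieved preprint."""
    title: str
    authors: str
    date: str
    url: str
    doi: str
    text: str  # abstract or full text in which trial IDs are searched


# Assumption: reviews are recognised from title wording alone.
_REVIEW_RE = re.compile(r"\b(review|meta-analysis)\b", re.IGNORECASE)


def preselect(preprints, trial_ids):
    """Return metadata for non-review preprints that mention a known trial ID."""
    seen_dois = set()
    selected = []
    for p in preprints:
        key = p.doi.lower()
        if key in seen_dois:          # de-duplicate on DOI
            continue
        seen_dois.add(key)
        if _REVIEW_RE.search(p.title):  # exclude reviews
            continue
        haystack = p.text.lower()
        matched = [t for t in trial_ids if t.lower() in haystack]
        if matched:                   # keep only preprints citing a trial ID
            selected.append({
                "title": p.title,
                "authors": p.authors,
                "date": p.date,
                "url": p.url,
                "doi": p.doi,
                "trial_ids": matched,
            })
    return selected
```

The list returned by `preselect` corresponds to the weekly pre-selection that the two reviewers would then screen manually for reported results.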
A total of 36,771 publications were uploaded to BioRxiv and MedRxiv between March 3 and November 9, 2020. Approximately 20–30 COVID-19 preprints per week were pre-selected by the tool. After manual screening and selection, a total of 123 preprints reporting clinical trial preliminary results were included. Additionally, 50 preprints that presented results of other study types on new vaccines and repurposed medicines for COVID-19 were reported.
Using text mining to identify preliminary clinical trial results proved an efficient way to handle the large volume of information. Semi-automating the search increased efficiency, allowing the reviewers to focus on relevant papers. More consistent reporting of trial IDs would support automation. Comparing the tool's accuracy when screening titles/abstracts versus full papers may help to support further refinement and increase efficiency gains.
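To illustrate why inconsistent reporting of trial IDs hinders automation, the following Python sketch normalises a few common written variants to a canonical form before matching. It is a hypothetical helper, not part of the tool described here: the function name `find_trial_ids` is invented, only two registry formats are covered (ClinicalTrials.gov `NCT` and `ISRCTN`, each followed by eight digits), and the separators tolerated are an assumption.

```python
import re

# Assumption: authors may insert a space, hyphen or colon after the prefix.
_PATTERNS = {
    "NCT": re.compile(r"\bNCT[\s\-:]*(\d{8})\b", re.IGNORECASE),
    "ISRCTN": re.compile(r"\bISRCTN[\s\-:]*(\d{8})\b", re.IGNORECASE),
}


def find_trial_ids(text):
    """Return canonical trial IDs found in text, without duplicates."""
    found = []
    for prefix, pattern in _PATTERNS.items():
        for match in pattern.finditer(text):
            canonical = prefix + match.group(1)
            if canonical not in found:
                found.append(canonical)
    return found
```

With a helper of this kind, variants such as `nct 01234567` and `NCT01234567` would map to the same identifier, although IDs that are misspelled or omitted entirely would still be missed.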
This project is funded by the NIHR [(HSRIC-2016-10009)/Innovation Observatory]. The views expressed are those of the author(s) and not necessarily those of the NIHR or the Department of Health and Social Care.