OP218 Searching Preprint Repositories For COVID-19 Therapeutics Using A Semi-Automated Text-Mining Tool

Sonia Garcia Gonzalez-Moral; Aalya Al-Assaf; Savitri Pandey; Oladapo Ogunbayo; Dawn Craig

doi:10.1017/S0266462321000799

Introduction

The COVID-19 pandemic led to a significant surge in clinical research activity in the search for effective and safe treatments. Use of preprint repositories rose as researchers sought to disseminate early findings from clinical trials and so accelerate patient access to promising treatments. In the UK, the NIHR Innovation Observatory (NIHRIO) provided primary horizon-scanning intelligence on global trials to a multi-agency initiative on COVID-19 therapeutics. This intelligence included signals from preliminary results to support the selection and prioritisation of, and access to, promising medicines.

Methods

A semi-automated text-mining tool, written in Python 3, used the trial IDs (identifiers) of ongoing and completed studies selected from major clinical trial registries according to pre-determined criteria. Two sources, BioRxiv and MedRxiv, were searched using these IDs as search terms. Each week, the tool automatically searched both sources, de-duplicated the hits, excluded reviews, and extracted the title, authors, publication date, URL and DOI of each remaining preprint. The output was verified by two reviewers, who manually screened it and excluded studies that did not report results.
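The automated steps described above (match on trial ID, de-duplicate, exclude reviews, keep bibliographic fields) can be illustrated with a minimal Python sketch. It is not the authors' tool: the `Preprint` record, the `preselect` function, the use of the DOI as the de-duplication key and the title-based review filter are all assumptions made for illustration, and the sketch works on records already retrieved rather than querying the repositories.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Preprint:
    """Hypothetical record holding the fields the abstract says are extracted."""
    title: str
    authors: str
    date: str
    url: str
    doi: str
    text: str  # abstract or full text in which trial IDs are looked up


def preselect(records, trial_ids):
    """Return preprints that mention a tracked trial ID.

    Assumed rules: duplicates are detected by DOI, and a preprint is
    treated as a review when the word "review" appears in its title.
    """
    wanted = [trial_id.upper() for trial_id in trial_ids]
    seen_dois = set()
    selected = []
    for record in records:
        doi = record.doi.strip().lower()
        if doi in seen_dois:
            continue  # de-duplicate: same DOI already processed
        seen_dois.add(doi)
        if "review" in record.title.lower():
            continue  # exclude reviews
        if any(trial_id in record.text.upper() for trial_id in wanted):
            selected.append(record)
    return selected
```

The records returned by such a step would still need the manual screening described above, since a mention of a trial ID does not guarantee that results are reported.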

Results

A total of 36,771 publications were uploaded to BioRxiv and MedRxiv between March 3 and November 9, 2020. Approximately 20–30 COVID-19 preprints per week were pre-selected by the tool. After manual screening and selection, a total of 123 preprints reporting preliminary clinical trial results were included. Additionally, 50 preprints that presented results of other study types on new vaccines and repurposed medicines for COVID-19 were reported.

Conclusions

Using text mining to identify preliminary clinical trial results proved an efficient way to handle the large volume of information. Semi-automation of searching increased efficiency, allowing the reviewers to focus on relevant papers. More consistent reporting of trial IDs would support automation. Comparing the tool's accuracy when screening titles/abstracts with its accuracy when screening full papers may support further refinement and increase efficiency gains.
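To illustrate why inconsistent reporting of trial IDs complicates automation, the following sketch normalises a few common spelling variants before matching. It is an assumption-laden example rather than part of the reported tool: the function name `extract_trial_ids` is hypothetical, only two registry formats are covered (ClinicalTrials.gov `NCT` and `ISRCTN`, each followed by eight digits), and the IDs in the usage line are made-up placeholders.

```python
import re

# Registry prefix, optional separator (space, hyphen or colon), eight digits.
# Only the NCT and ISRCTN formats are handled in this sketch.
TRIAL_ID_PATTERN = re.compile(r"\b(NCT|ISRCTN)[\s\-:]*(\d{8})\b", re.IGNORECASE)


def extract_trial_ids(text):
    """Return normalised trial IDs found in text, in order of first appearance."""
    found = []
    for prefix, digits in TRIAL_ID_PATTERN.findall(text):
        trial_id = prefix.upper() + digits
        if trial_id not in found:
            found.append(trial_id)
    return found


# Placeholder IDs written in three different styles.
ids = extract_trial_ids("Registered as nct 01234567 (NCT01234567) and ISRCTN-12345678.")
```

Variants that such a pattern does not anticipate would still be missed, which is one reason consistent reporting by authors matters.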

This project is funded by the NIHR [(HSRIC-2016-10009)/Innovation Observatory]. The views expressed are those of the author(s) and not necessarily those of the NIHR or the Department of Health and Social Care.
