6 Exploring data scraping on ClinicalTrials.gov to identify key variables to include in an EHR-based recruitment tool
Published online by Cambridge University Press: 11 April 2025
Abstract
Objectives/Goals: Failure to achieve recruitment goals results in termination of ~20% of clinical trials and delays >85% of trial timelines. We aim to develop an electronic health record (EHR)-based recruitment tool to ease identification of participants. We sought to determine whether criteria listed on clinicaltrials.gov could support selection of tool variables.
Methods/Study Population: To inform the variables to include in the EHR-based recruitment tool, we scraped data from clinicaltrials.gov to identify key inclusion and exclusion criteria common across a variety of diabetes clinical trials. We included actively recruiting or recently active phase 2 and 3 clinical trials of adults aged >18 years in the USA. We classified identified variables as clinically relevant or not and compared the clinically relevant terms with inclusion and exclusion criteria (~20 variables) that were individually identified by three diabetes clinical trialists and two clinical research coordinators (CRCs).
Results/Anticipated Results: We reviewed 203 clinical trials listed on clinicaltrials.gov. We identified 115 terms, 91 of which were clinically relevant. All 3 clinical trialists, 1 of 2 CRCs, and all trials listed age as a key variable. Consistent with data scraping, all trialists and CRCs identified glucose-lowering medications and kidney function as important criteria. Gender, ethnicity, and race were less commonly noted on clinicaltrials.gov and were listed by 2 of 3 trialists and 1 CRC. Cardiovascular conditions (e.g., history of myocardial infarction), thyroid function tests, and contraceptive requirements were common criteria on clinicaltrials.gov, but only 1 trialist and 1 CRC identified these variables. Active infections (e.g., HIV) and C-peptide were not highlighted by trialists or CRCs but were common on clinicaltrials.gov.
Discussion/Significance of Impact: An EHR-based recruitment tool may facilitate identification of trial participants, but identifying key variables to include is essential. We found that data scraping for variables on clinicaltrials.gov mostly aligned with expert opinion, suggesting that automating variable selection via extraction from clinicaltrials.gov may be acceptable.
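The abstract does not describe how the scraping was implemented. As a rough illustration only, the term-counting and expert-comparison steps might look like the Python sketch below. It assumes the eligibility text of each trial has already been downloaded as a plain string. The term patterns, the sample criteria texts, the 50% threshold, and the function names are all hypothetical and are not taken from the study.

```python
import re
from collections import Counter

# Hypothetical candidate variables and regex patterns used to detect them.
# A real tool would need a much richer, clinically reviewed vocabulary.
TERM_PATTERNS = {
    "age": r"\bage[ds]?\b|\byears? old\b",
    "kidney function": r"\begfr\b|\bcreatinine\b|\brenal\b|\bkidney\b",
    "glucose-lowering medication": r"\bmetformin\b|\binsulin\b|\bsulfonylureas?\b",
    "c-peptide": r"\bc-peptide\b",
}


def count_trials_mentioning(criteria_texts, patterns=TERM_PATTERNS):
    """Count, for each term, how many trials mention it at least once."""
    counts = Counter()
    for text in criteria_texts:
        lowered = text.lower()
        for term, pattern in patterns.items():
            if re.search(pattern, lowered):
                counts[term] += 1
    return counts


def common_terms(criteria_texts, min_fraction=0.5, patterns=TERM_PATTERNS):
    """Return the terms mentioned in at least min_fraction of the trials."""
    total = len(criteria_texts)
    if total == 0:
        return []
    counts = count_trials_mentioning(criteria_texts, patterns)
    return sorted(t for t, c in counts.items() if c / total >= min_fraction)


def compare_with_experts(scraped_terms, expert_terms):
    """Split terms into those both sources agree on and those unique to one."""
    scraped, expert = set(scraped_terms), set(expert_terms)
    return {
        "both": sorted(scraped & expert),
        "scraped_only": sorted(scraped - expert),
        "expert_only": sorted(expert - scraped),
    }


# Made-up eligibility texts, for illustration only.
SAMPLE_CRITERIA = [
    "Inclusion: age 18 to 75 years; on stable metformin dose. Exclusion: eGFR < 30.",
    "Inclusion: adults aged 18 or older; fasting C-peptide >= 0.2 nmol/L. "
    "Exclusion: renal impairment.",
    "Inclusion: type 2 diabetes treated with insulin. "
    "Exclusion: serum creatinine > 2.0 mg/dL.",
]

if __name__ == "__main__":
    scraped = common_terms(SAMPLE_CRITERIA)
    print(scraped)
    print(compare_with_experts(scraped, ["age", "kidney function", "gender"]))
```

Counting trials rather than raw mentions keeps a single trial that repeats a term from inflating its apparent importance. Simple keyword matching of this kind will miss synonyms and negations, so the output would still need clinical review, as the classification step in the Methods implies.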
- Type
- Contemporary Research Challenges
- Information
- Creative Commons
- This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives licence (https://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is unaltered and is properly cited. The written permission of Cambridge University Press must be obtained for commercial re-use or in order to create a derivative work.
- Copyright
- © The Author(s), 2025. The Association for Clinical and Translational Science