Creating enriched training sets of eligible studies for large systematic reviews: the utility of PubMed's Best Match algorithm
Published online by Cambridge University Press: 18 December 2020
Abstract
Solutions like crowd screening and machine learning can assist systematic reviewers with heavy screening burdens but require training sets containing a mix of eligible and ineligible studies. This study explores using PubMed's Best Match algorithm to create small training sets containing at least five relevant studies.
Six systematic reviews were examined retrospectively. MEDLINE searches were converted and run in PubMed. The ranking of included studies was studied under both Best Match and Most Recent sort conditions.
Retrieval sizes for the systematic reviews ranged from 151 to 5,406 records and the numbers of relevant records ranged from 8 to 763. The median ranking of relevant records was higher (closer to the top of the results) under Best Match than under Most Recent sort for all six reviews. Best Match placed a total of thirty relevant records in the first fifty, at least one for each systematic review. Most Recent sorting placed only ten relevant records in the first fifty. Best Match sorting outperformed Most Recent in all cases and placed five or more relevant records in the first fifty in three of six cases.
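The two measures reported above, the median rank of relevant records and the number of relevant records in the first fifty, can be computed from any ranked result list. The following is a minimal sketch, not the authors' analysis code; the function name `rank_summary`, its parameters, and the toy identifiers are hypothetical.

```python
from statistics import median


def rank_summary(ranked_ids, relevant_ids, cutoff=50):
    """Summarise where relevant records fall in a ranked result list.

    ranked_ids   -- record identifiers in the order returned by the sort
    relevant_ids -- identifiers of records included in the review
    cutoff       -- size of the leading block to inspect (fifty in the study)
    """
    relevant = set(relevant_ids)
    # 1-based positions of every relevant record in the ranking
    ranks = [pos for pos, rid in enumerate(ranked_ids, start=1) if rid in relevant]
    return {
        "median_rank": median(ranks) if ranks else None,
        "relevant_in_cutoff": sum(1 for r in ranks if r <= cutoff),
    }


# Toy illustration with made-up identifiers (not data from the study)
summary = rank_summary(["a", "b", "c", "d", "e", "f"], {"b", "e", "f"}, cutoff=3)
```

Running the same function on the Best Match ordering and the Most Recent ordering of one retrieval would give the kind of side-by-side comparison described here; a numerically smaller median rank means relevant records sit closer to the top.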
Using a predetermined set size such as fifty may not provide enough true positives for an effective systematic review training set. However, screening PubMed records ranked by Best Match and continuing until the desired number of true positives is identified is efficient and effective.
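The stopping rule described above (screen in ranked order until a target number of true positives is reached, rather than stopping at a fixed set size) can be sketched as follows. This is an illustration under stated assumptions: `screen_until` and `is_relevant` are hypothetical names, and `is_relevant` stands in for a human screener's eligibility decision.

```python
def screen_until(ranked_ids, is_relevant, target=5):
    """Screen records in ranked order until `target` relevant ones are found.

    Returns (number of records screened, list of relevant ids found).
    If the list is exhausted first, fewer than `target` ids are returned.
    """
    found = []
    screened = 0
    for rid in ranked_ids:
        screened += 1
        if is_relevant(rid):  # stand-in for the screener's judgement
            found.append(rid)
            if len(found) >= target:
                break
    return screened, found


# Toy illustration: records 1..20, with multiples of 3 treated as relevant
screened, found = screen_until(range(1, 21), lambda rid: rid % 3 == 0, target=3)
```

The records screened up to the stopping point, eligible and ineligible together, would then form the training set, so its size varies by review instead of being fixed at fifty.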
The Best Match sort in PubMed improves the ranking and increases the proportion of relevant records in the first fifty records relative to sorting by recency.
Copyright © The Author(s), 2020. Published by Cambridge University Press