Introduction to Machine Learning in Digital Healthcare Epidemiology

Published online by Cambridge University Press: 05 November 2018

Jan A. Roth
Affiliation:
Division of Infectious Diseases and Hospital Epidemiology, University Hospital Basel, Basel, Switzerland; Basel Institute for Clinical Epidemiology and Biostatistics, University Hospital Basel, Basel, Switzerland
Manuel Battegay
Affiliation:
Division of Infectious Diseases and Hospital Epidemiology, University Hospital Basel, Basel, Switzerland
Fabrice Juchler
Affiliation:
Division of Infectious Diseases and Hospital Epidemiology, University Hospital Basel, Basel, Switzerland
Julia E. Vogt
Affiliation:
Adaptive Systems and Medical Data Science, Department of Mathematics and Computer Science, University of Basel, Basel, Switzerland; Swiss Institute of Bioinformatics, Basel, Switzerland
Andreas F. Widmer*
Affiliation:
Division of Infectious Diseases and Hospital Epidemiology, University Hospital Basel, Basel, Switzerland
*Author for correspondence: Andreas F. Widmer, MD, MS, Division of Infectious Diseases and Hospital Epidemiology, University Hospital Basel, Petersgraben 4, 4031 Basel, Switzerland. E-mail: [email protected]

Abstract

To exploit the full potential of big routine data in healthcare and to communicate and collaborate efficiently with information technology specialists and data analysts, healthcare epidemiologists should have some knowledge of large-scale analysis techniques, particularly machine learning. This review focuses on the broad area of machine learning and its first applications in the emerging field of digital healthcare epidemiology.

Type
Review
Copyright
© 2018 by The Society for Healthcare Epidemiology of America. All rights reserved. 

Footnotes

a Authors of equal contribution.

Cite this article: Roth JA, et al. (2018). Introduction to Machine Learning in Digital Healthcare Epidemiology. Infection Control & Hospital Epidemiology 2018, 39, 1457–1462. doi: 10.1017/ice.2018.265
