Introduction
ST-segment elevation myocardial infarction (STEMI) is a time-sensitive condition that requires prompt diagnosis. Time from onset of symptoms to definitive treatment is directly related to patient survival, with current guidelines recommending percutaneous coronary intervention (PCI) within 90 minutes of emergency department (ED) arrival (“door-to-balloon” or D2B). Reference Bossaert, O’Connor and Arntz1–Reference Cannon, Gibson and Lambrew3 Prehospital acquisition of electrocardiograms (ECGs) is consistently associated with reduced time to treatment of STEMI due to pre-activation of the cardiac catheterization lab (CCL) and is a Class I American Heart Association (Dallas, Texas USA) recommendation. Reference Bossaert, O’Connor and Arntz1,Reference Brown, Mahmud, Dunford and Ben-Yehuda4–10
Currently, Emergency Medical Services (EMS) systems use a variety of methods for ECG interpretation, including paramedic interpretation, software interpretation, transmission of the ECG for interpretation by a remote physician, or some combination of the three. While paramedic ECG interpretation with sufficient training has been shown to be non-inferior to that of an emergency physician, training to achieve this level of competence is time and resource intensive, making it infeasible for many EMS systems. Reference Feldman, Brinsfield, Bernard, White and Maciejko11 Transmission of the ECG to a physician is a commonly applied solution, but this can be unreliable due to technical difficulties, such as poor wireless data signal. Reference D’Arcy, Bosson and Kaji12–Reference Sejersten, Sillesen and Hansen14 Additionally, reliance on ED physician interpretation creates additional interruptions and increases physician workload, both of which are hypothesized to impact ED patient care. Reference Westbrook, Raban, Walter and Douglas15–Reference Chisholm, Dornfeld, Nelson and Cordell18 Software algorithms are an easy solution, widely available on the cardiac monitor, and can be highly accurate. Reference Tanaka, Matsuo and Kikuchi19 However, if applied in isolation, they often result in a high rate of false-positive activation given the wide use of screening prehospital ECGs. Reference Tanaka, Matsuo and Kikuchi19–Reference Bhalla, Mencl, Gist, Wilber and Zalewski26 The authors previously derived a set of criteria to optimize specificity of software interpretation of STEMI from the literature and applied it to a small dataset. Reference Goebel, Vaida, Kahn and Donofrio27 In this study, the authors apply these criteria to a large external dataset with known outcomes, classifying software interpretations of STEMI as either true positive or false positive, in order to validate criteria that optimize specificity and can be easily applied by a paramedic.
In the EMS setting, it may be desirable to identify a subgroup of patients who could bypass the need for physician over-read and reduce time to definitive care. The authors hypothesize that for EMS patients with an ECG software interpretation of STEMI, a set of simple criteria can be used, based on common causes of false-positive interpretations, to improve the specificity of the software interpretation. The authors further hypothesize that through this combination of clinician and machine, a subset of STEMI patients can be identified with specificity and positive likelihood ratio sufficient for prehospital CCL activation by paramedics without physician over-read, where those that fall out of the algorithm default to the current standard of care for cases with a high suspicion for STEMI. The objective of this study was to determine the accuracy of adding this set of criteria to STEMI-positive ECGs, compared to software interpretation alone, in a population with known outcomes.
Methods
Study Design
A retrospective analysis was performed using a large dataset of previously collected consecutive cases with prehospital 12-lead ECGs recorded by a single large urban EMS agency. This study was approved by the University of Massachusetts Chan Medical School - Baystate (Springfield, Massachusetts USA) institutional review board, having been deemed not human subjects research (protocol number BH-22-186).
Population and Setting
The Los Angeles Fire Department (LAFD; Los Angeles, California USA) is the 9-1-1 EMS provider for the city of Los Angeles, serving a population of four million, with over 200,000 transports annually. The LAFD is one of 28 municipal fire departments operating in Los Angeles County, which has a regional cardiac care system comprising 34 hospitals designated as STEMI receiving centers (SRCs). Reference Eckstein, Koenig, Kaji and Tadeo28 Paramedics acquire 12-lead ECGs on all patients with chest pain, discomfort, or other symptoms in whom they suspect a cardiac etiology, as well as patients at high risk for an acute cardiac event based on medical history, patients with new dysrhythmia, and patients resuscitated from cardiac arrest. At the time of the acquisition of these cases, paramedics used the LIFEPAK 15 (LP15; Stryker Medical, Kalamazoo, Michigan USA) monitor’s interpretation produced by the University of Glasgow (Glasgow, Scotland) ECG analysis program (Version 27) to identify a possible STEMI and assess the quality of the tracing. If the software generated the statement “∗∗∗ MEETS ST ELEVATION MI CRITERIA ∗∗∗,” the patient was triaged as a STEMI. The ECG tracing is wirelessly transferred to a tablet computer, which transmits the tracing over a cellular data connection to the SRC. The ECG transmission occurs separately from the patient care record. Paramedics then call to notify the SRC. All SRCs reported patient outcomes to a single registry maintained by the Los Angeles County EMS Agency, as previously described. Reference Bosson, Kaji and Niemann29 All patients transported by LAFD paramedics with a possible STEMI identified either prehospital or in the ED are included in the registry.
During the study period of July 2011 through June 2012, LAFD clinicians documented patient encounters using the HealthEMS electronic patient care record (ePCR) system (Stryker Medical; Kalamazoo, Michigan USA) and used the LP15 monitor. Adult patients (age 18 years or older) were included if the EMS case was in the ePCR system and had at least one 12-lead ECG interpreted by the LP15 algorithm. Patients less than 18 years of age were excluded because the LP15 does not give a STEMI interpretation for these patients. Additionally, interfacility transfer cases were excluded. Only a single ECG was used from each patient encounter: the first ECG of adequate quality. The ECG selection methodology is described in the previous publication that detailed the creation of this dataset. Reference Bosson, Sanko and Stickney30
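To make the inclusion logic concrete, the following is a minimal sketch of the cohort filter in R (the language used for the study's analysis). The variable names (age, has_lp15_ecg, interfacility, adequate_quality, ecg_time, case_id) are illustrative placeholders rather than actual ePCR or dataset field names; the authoritative ECG selection methodology remains that described in the cited prior publication.

```r
library(dplyr)

# Illustrative sketch only; column names are hypothetical placeholders.
build_cohort <- function(epcr_cases, ecgs) {
  epcr_cases %>%
    filter(age >= 18, has_lp15_ecg, !interfacility) %>%   # adults with an LP15-read ECG, no transfers
    inner_join(ecgs, by = "case_id") %>%
    filter(adequate_quality) %>%
    group_by(case_id) %>%
    slice_min(ecg_time, n = 1, with_ties = FALSE) %>%     # first adequate-quality ECG per encounter
    ungroup()
}
```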
Measurements
In the dataset, each case was classified as to whether emergent coronary angiography was indicated based on hospital data in the SRC registry by following the same classification method used by prior investigators. Reference Squire, Tamayo-Sarver, Rashi, Koenig and Niemann31 Cases were classified as “emergent coronary angiography indicated” if the SRC registry confirmed any one of the following outcomes: PCI was performed; or PCI was not performed due to the need for coronary artery bypass grafting, intra-aortic balloon pump placement, difficult catheterization, multivessel coronary artery disease, coronary vasospasm, or patient death. Cases were also classified as “emergent coronary angiography indicated” if the CCL was cancelled or not activated due to advanced age, allergy to contrast, CCL not available, presence of a do not resuscitate order, comorbidity, refusal of treatment, or transfer (ie, were it not for the presence of a specific condition or circumstance, then the patient would have gone to the CCL). Cases were classified as “emergent coronary angiography not indicated” if any of the following were true: the SRC data included a completed catheterization with no lesion and no vasospasm reported; the SRC data indicated that the CCL was cancelled or not activated due to physician interpretation of not STEMI or poor-quality prehospital ECG; or the patient with a field ECG interpretation of not STEMI was not found in the SRC registry, as the SRC database is inclusive of all cases of STEMI diagnosed in the field or SRC EDs.
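As an illustration only, the classification logic described above could be encoded as a single dplyr::case_when() rule. The registry field names below are hypothetical, and cases that cannot be classified from registry data fall through to the cardiologist review described in the next paragraph.

```r
library(dplyr)

# Sketch of the outcome classification; column names are hypothetical, not SRC registry fields.
classify_outcome <- function(registry) {
  registry %>%
    mutate(
      angio_indicated = case_when(
        pci_performed ~ TRUE,
        no_pci_reason %in% c("CABG", "IABP", "difficult catheterization",
                             "multivessel CAD", "vasospasm", "death") ~ TRUE,
        cath_cancel_reason %in% c("advanced age", "contrast allergy",
                                  "CCL unavailable", "DNR order",
                                  "comorbidity", "refusal", "transfer") ~ TRUE,
        cath_completed & !lesion_found & !vasospasm_reported ~ FALSE,
        cath_cancel_reason %in% c("physician read: not STEMI",
                                  "poor-quality prehospital ECG") ~ FALSE,
        field_interpretation_not_stemi & !found_in_src_registry ~ FALSE,
        TRUE ~ NA  # not classifiable from registry data; resolved by cardiologist review
      )
    )
}
```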
For cases in which the LP15 interpretation was STEMI but the outcome was not available in the registry, three cardiologists blinded to the patients’ treatment and outcome each independently classified the ECG as to whether emergent coronary angiography was indicated. A simple majority was used to resolve disagreements. The cardiologists’ interrater reliability was evaluated using Fleiss’ kappa.
Outcome Measures
For this analysis, the authors sought to validate a set of criteria derived from the literature and previously applied to a small dataset, designed to exclude false-positive software interpretations of STEMI (Table 1). Reference Sanko, Eckstein and Bosson22,Reference Kado, Wilson, Strom and Box25,Reference Bosson, Sanko and Stickney30,Reference Coffey, Serra, Goebel, Espinoza, Castillo and Dunford32 Criteria suggesting a true positive were: a tracing free of artifact or baseline wander; ≥ 1 millimeter (mm) of ST-segment elevation in at least two contiguous leads; heart rate < 130 beats per minute; and QRS duration < 100 milliseconds. Reference de Champlain, Boothroyd and Vadeboncoeur23,Reference Goebel, Vaida, Kahn and Donofrio27,Reference Bosson, Sanko and Stickney30,Reference Swan, Nighswonger, Boswell and Stratton33,Reference Pilbery, Teare, Goodacre and Morris34 A case was considered true positive if it met all four criteria. The primary outcome was the test characteristics of the complete set of four criteria when applied to a prehospital ECG software interpretation of STEMI in order to identify true positive STEMI. Secondary analyses evaluated the association between the number of criteria used and the test characteristics.
Abbreviation: STEMI, ST-segment elevation myocardial infarction.
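A minimal sketch of the four-criteria rule in R is shown below; a case is flagged only if the software reads STEMI and all four criteria are met. The argument names are illustrative, not the study dataset's variable names.

```r
# Sketch of the four-criteria rule; all criteria must be met.
meets_all_criteria <- function(software_stemi, no_artifact, ste_mm,
                               contiguous_leads, heart_rate, qrs_ms) {
  software_stemi &                           # LP15 software interpretation of STEMI
    no_artifact &                            # tracing free of artifact or baseline wander
    (ste_mm >= 1 & contiguous_leads >= 2) &  # >= 1 mm ST elevation in >= 2 contiguous leads
    heart_rate < 130 &                       # beats per minute
    qrs_ms < 100                             # milliseconds
}

# Example: a clean software-positive tracing with 2 mm of elevation in two
# contiguous leads, heart rate 88, and QRS 92 ms would qualify.
meets_all_criteria(TRUE, TRUE, ste_mm = 2, contiguous_leads = 2,
                   heart_rate = 88, qrs_ms = 92)
#> [1] TRUE
```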
The original dataset did not contain human-derived reports of the presence of ST-segment elevation or the presence of artifact on each ECG. These data points were generated post-hoc by a team of medical students, resident physicians, and paramedics. All raters received training by the primary investigator, which included a standardized training video followed by a practice rating session using a standard set of ECGs. After training, each rater was assigned a portion of the total ECGs with overlap such that each ECG was given a binary rating by three independent raters for both the presence of artifact and the presence of ST-segment elevation. A simple majority was used as the final rating for each category. Interrater reliability was evaluated using Fleiss’ kappa for both the artifact and ST-segment elevation ratings.
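The majority-vote aggregation and agreement statistic can be sketched as follows; the irr package's kappam.fleiss() is one common R implementation of Fleiss' kappa, though the study does not state which implementation was used.

```r
library(irr)  # provides kappam.fleiss()

# ratings: one row per ECG, one column per rater (TRUE = artifact present); toy data only
ratings <- matrix(c(TRUE,  TRUE,  FALSE,
                    FALSE, FALSE, FALSE,
                    TRUE,  TRUE,  TRUE),
                  ncol = 3, byrow = TRUE)

final_label <- rowSums(ratings) >= 2   # simple majority of the three raters
kappam.fleiss(ratings)                 # chance-corrected interrater agreement
```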
Statistical Analyses
Statistical analysis was performed in R (R Core Team; Vienna, Austria) using the Tidyverse and Tidymodels packages. Reference Wickham, Averick and Bryan35,Reference Kuhn and Wickham36 The test characteristics of sensitivity, specificity, area under the receiver operator curve (AUC), and positive and negative likelihood ratios, along with their 95% confidence intervals, were determined. All four criteria were applied to cases the software identified as STEMI; combinations of one, two, or three criteria were additionally tested as different versions of the algorithm, along with the software interpretation alone. Contingency tables (2×2) were created to compare each version to the gold standard of appropriate CCL activation. Linear regression was used to test for associations between test characteristics and the number of criteria used in each version of the algorithm.
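For reference, the test characteristics of a single binary decision rule can be computed directly from the 2×2 table; the sketch below uses placeholder counts, not the study's actual cell counts. Because such a rule has only one operating point, its empirical AUC reduces to the average of sensitivity and specificity.

```r
# Test characteristics from a 2x2 table (placeholder counts, for illustration only).
test_characteristics <- function(tp, fp, fn, tn) {
  sens <- tp / (tp + fn)
  spec <- tn / (tn + fp)
  list(
    sensitivity = sens,
    specificity = spec,
    lr_positive = sens / (1 - spec),
    lr_negative = (1 - sens) / spec,
    auc         = (sens + spec) / 2  # single-threshold rule: AUC equals the mean of sens and spec
  )
}

test_characteristics(tp = 75, fp = 20, fn = 450, tn = 44000)
```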
Results
The dataset included a total of 44,611 prehospital cases with associated ECGs. Of these, 1,193 had a software interpretation of STEMI. Outcomes were missing in 299 of these cases, and the cardiologists’ consensus was used. There were no cases of missing ECG measurements. There were no missing labels for artifact or ST-segment elevation, as these ratings were assigned post-hoc. Based on the definition of “emergent coronary angiography indicated,” 529 patients were ultimately classified as STEMI and 44,082 as not STEMI. A detailed breakdown of cases by computer interpretation of STEMI, unmet criteria, and outcomes can be found in the Supplementary Material (available online only). The cohort characteristics are found in Table 2. The patients in the STEMI and non-STEMI groups were predominantly white (62% versus 54%), predominantly male (72% versus 50%), and in their mid-60s (median age 63 versus 65). For cases that did not meet all four criteria, the most common unmet criterion was heart rate, followed by ST-segment elevation, QRS duration, and finally artifact (Figure 1). The interrater reliability for rating artifact was poor (Fleiss’ kappa -0.12; P < .01), but ST-segment elevation agreement was moderate (Fleiss’ kappa 0.39; P < .01). The interrater reliability of the three cardiologists’ decision to activate the CCL was moderate (Fleiss’ kappa 0.43; P < .01).
Abbreviation: STEMI, ST-segment elevation myocardial infarction.
a Median (IQR); n (%).
b “Unknown” not included in known statistics.
Table 3 shows the sensitivity, specificity, positive and negative likelihood ratios, and AUC for each combination of the four criteria. The version of the algorithm using all four criteria (V1) had the highest positive likelihood ratio (95% CI, 210-595) and specificity (95% CI, 99.93-99.98), but the lowest sensitivity (95% CI, 11-17). The AUC for this version was 0.57 (95% CI, 0.55-0.58; Figure 2). Overall, 89 cases met all four criteria (7.5% of the 1,193 software-positive cases). There was a positive correlation between the number of criteria used and the positive likelihood ratio (adjusted r² = 0.90), as well as specificity (adjusted r² = 0.85). There was a negative correlation between the number of criteria used and sensitivity (adjusted r² = −0.94) and AUC (adjusted r² = −0.95). The software interpretation alone (algorithm V0) demonstrated a sensitivity of 91% (95% CI, 88-93), specificity 98% (95% CI, 98-99), positive likelihood ratio 56 (95% CI, 52-61), and AUC of 0.95 (95% CI, 0.94-0.96).
Note: Estimate (95% CI).
Abbreviations: LR, likelihood ratio; AUC, area under receiver operator curve; ART, artifact; HR, heart rate; QRS, QRS duration; STE, ST-segment elevation.
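As a consistency check (not part of the study's analysis), the positive likelihood ratio, sensitivity, and specificity are linked by a simple identity. Taking the point estimates quoted in the Discussion (specificity 99.96%) and a sensitivity near the middle of the reported 11%-17% interval:

```latex
\[
  \mathrm{LR}^{+} \;=\; \frac{\text{sensitivity}}{1-\text{specificity}}
  \;\approx\; \frac{0.14}{1-0.9996} \;=\; 350
\]
```

This is broadly consistent with the positive likelihood ratio of 353 reported for the four-criteria version.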
Linear regression showed small but statistically significant associations between each criterion and AUC, positive likelihood ratio, sensitivity, and specificity (ie, adding criteria resulted in incremental changes in the test characteristics; see Supplementary Material for detailed regression output). The effect of the presence of ST-segment elevation on the positive likelihood ratio was larger than the effect of any other criterion on any test characteristic. For any given test characteristic, the effect sizes were usually small but with the same directionality: negative for sensitivity and AUC, positive for positive likelihood ratio and specificity (Figure 3).
Discussion
The results of this retrospective cohort study suggest that adding four human-applied criteria can improve the specificity of the software interpretation of STEMI compared to software interpretation alone. The presence of all four criteria increased the likelihood of true positive STEMI, with a positive likelihood ratio of 353 and 99.96% specificity. Given the high specificity, these criteria could allow independent one-way decision making by paramedics to activate the CCL without relying on transmission for physician over-read, potentially reducing time to intervention for a subgroup of STEMI patients. While this comes at a significant cost in sensitivity, patients with a software interpretation of STEMI that do not meet all four criteria could default to the current standard of care for cases with a high suspicion of STEMI. In this cohort, utilizing all four criteria to activate the CCL would result in a 97.6% reduction in false positives compared to using the software interpretation alone (711 to 17). For patients meeting all criteria, direct activation from the field without physician over-read would be highly accurate and expedite definitive care. Even with a very low pre-test probability of STEMI, cases with a software interpretation of STEMI that meet all four criteria are extremely likely to require emergency angiography.
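A worked check of the quoted reduction, using the false-positive counts stated above (711 with software interpretation alone versus 17 with all four criteria):

```latex
\[
  \frac{711-17}{711} \;=\; \frac{694}{711} \;\approx\; 0.976 \quad (97.6\%)
\]
```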
While other criteria have been shown to increase the accuracy of the identification of STEMI, these are not interpretable by a human and would be better suited to revising software models. Reference Wu, Zhou and Liu37–Reference Klein, Shroff, Beeman and Smith42 These four criteria together, easily interpreted and applied by EMS providers, had greater specificity than any other combination. The presence of EMS clinician-confirmed ST-segment elevation had the greatest effect on positive predictive value, more than heart rate, QRS duration, or absence of artifact. V13 of the algorithm (Table 3) used only ST-segment elevation to augment the software interpretation and had a positive likelihood ratio of 150 (95% CI, 128-174). A total of 536 cases would meet this version of the criteria (44.9% of the 1,193 software-positive cases). If there was a desire to simplify the algorithm to use only a single criterion in addition to the software interpretation, verification of ST-segment elevation would be the best candidate. Training paramedics to identify and apply these criteria would differ from traditional ECG training in that the sole purpose is to identify whether a case is a true or false positive without trying to identify the underlying diagnosis (eg, left bundle branch block). This simplification may reduce the training time and costs associated with paramedic identification of STEMI.
The most frequent criteria not met were heart rate and QRS duration, both of which have been implicated as common causes for prehospital false-positive software interpretation of STEMI. Reference Sanko, Eckstein and Bosson22,Reference Kado, Wilson, Strom and Box25,Reference Bosson, Sanko and Stickney30,Reference Coffey, Serra, Goebel, Espinoza, Castillo and Dunford32 Artifact was also a frequent cause of failure, although agreement on which ECGs contained artifact was poor. While the raters were shown examples of high- and low-quality ECGs during their standardized training, there may be a need for more objective criteria for poor ECG quality. The presence of artifact affected the test characteristics with a magnitude similar to the other criteria, but poor interrater reliability may reduce its utility overall. There are many challenges to acquiring high-quality ECGs in the prehospital setting, but additional training on ECG acquisition skills may improve data quality and reduce the contribution of artifact to software misinterpretation. Repeating the ECG may also have a role, as previous literature has shown that repeat prehospital ECGs increase the identification of STEMI. Reference Verbeek, Ryan, Turner and Craig43,Reference Tanguay, Lebon, Lau, Hébert and Bégin44 It is unclear if this effect is from the evolution of the infarct or acquiring a higher quality ECG.
While these data demonstrate relatively favorable test characteristics for the software interpretation of STEMI alone, systems that solely rely on software interpretation experience false CCL activations ranging from 10% to 50%, likely owing to the broad use of screening ECGs. Reference Sanko, Eckstein and Bosson22–Reference Bhalla, Mencl, Gist, Wilber and Zalewski26 The criteria can be used to identify a subgroup of true STEMI patients with a very high degree of certainty without the need for ECG transmission or physician over-read, thus allowing for EMS routing and resource mobilization as appropriate for a given system, while the remainder of patients would default to the current standard of care for cases with high suspicion of STEMI. This approach may appeal to EMS systems where mobile data connections used for ECG transmission are problematic, or in systems that cannot afford the additional equipment needed for ECG transmission. While hospitals will ultimately need to verify the prehospital ECG upon arrival, applying the algorithm would allow for more accurate activation of the CCL, reduce D2B time through parallel processing, and preserve resources through reduction of ECG overcalls. This could also be valuable for systems that currently rely on transmission for physician interpretation by allowing physicians to focus on the more challenging cases that fail the algorithmic criteria and require further interpretation. This would have the added benefit of reducing alarm fatigue and interruptions in a specialty where disruptions to clinical workflow already occur at a startlingly high frequency. Reference Westbrook, Raban, Walter and Douglas15–Reference Chisholm, Dornfeld, Nelson and Cordell18 The benefit of implementing any of these systems will vary dramatically based on facility and regional characteristics.
Limitations
These results must be considered in the context of multiple limitations. Given the retrospective design, this study is subject to documentation errors and confounding. This study used Physio-Control LIFEPAK 15 devices with the University of Glasgow ECG analysis program (Version 27). While the dataset is over 10 years old, the current model of LIFEPAK being sold (LIFEPAK 15 V4+) still uses the same University of Glasgow Version 27 software algorithm for ECG interpretation as the devices in the dataset. These results may not be generalizable to other brands of monitor, despite research implicating similar sources of false positives between software algorithms. Reference Tanaka, Matsuo and Kikuchi19,Reference Sanko, Eckstein and Bosson22–Reference Bhalla, Mencl, Gist, Wilber and Zalewski26 The reported differences in baseline sensitivity and specificity would likely affect the performance of the algorithm when applied to devices using different software interpretation. New software interpretation that utilizes machine learning may have entirely different sources of errors or be so accurate as to obviate the incremental gains from an approach such as this. Reference Chen, Wang and Liu45,Reference Forberg, Khoshnood and Green46
Previous studies have used a variety of gold standards for true STEMI, including physician consensus of ECG findings, disposition to CCL, cardiac biomarkers, and CCL outcomes, which makes comparison difficult. The authors used appropriate CCL activation, determined based on a number of outcomes as well as cardiologist consensus. This was chosen because the primary goal for EMS in the field is to determine which patients need routing for emergent PCI. While the intent for this algorithm is to eventually be applied by clinicians in the field, in this study, artifact and presence of ST-segment elevation were evaluated by paramedics, medical students, and residents in a controlled research setting. Given the pressures involved while providing prehospital care, it is unknown how accurately prehospital clinicians can utilize these criteria to classify cases using this algorithm. Further, the interrater reliability for artifact was poor, which could affect the application of these criteria. Finally, this study was conducted at a single urban agency, and as such, the results may not be generalizable to all systems, particularly rural areas.
Conclusions
A simple set of four criteria (heart rate <130 beats per minute, QRS duration <100 milliseconds, verification of ST-segment elevation, and absence of artifact) applied in addition to the software ECG interpretation can identify cases with a high probability of being a true STEMI. Activation of the CCL in these cases by paramedics, without physician over-read, would reduce the need for ECG transmission and physician interpretation and may reduce D2B time. Future research should evaluate the effects of prospective, real-time application of these criteria and consider patient-oriented outcomes.
Conflicts of interest/funding
This work was supported by the National Center for Advancing Translational Sciences, National Institutes of Health under grant TL1TR002546. The authors have no other conflicts of interest to declare.
Acknowledgements
The authors wish to thank and acknowledge the following individuals for their important contributions to this project: Sara Stadnicki, Jack Pietrykowski, Alex Maloof, Matthew C. Sterling, David Owen, William J French, James G. Jollis, Michael C. Kontos, Tyson G. Tyler, Ronald E. Stickney, and Richard Tadeo.
Supplementary Materials
To view supplementary material for this article, please visit https://doi.org/10.1017/S1049023X23006635