INTRODUCTION
Radiographs are a key diagnostic tool used by the emergency physician (EP). During their emergency department (ED) shift, EPs interpret radiographs to aid or confirm a diagnosis that affects both treatment plan and disposition of the patient. In most EDs, radiographs are subsequently interpreted by a radiologist, who issues an official report. Most EDs use a quality assurance (QA) system that ensures that discrepancies between the initial ED and final radiologist interpretations are addressed and changes in management made as required. Despite this retrospective system, the accuracy of the initial, real-time radiograph interpretation is important, given the risk of errors in the ED that could have consequences for morbidity and mortality.
Studies in pediatric EDs have examined the accuracy of EPs’ interpretation of radiographs and the impact of discrepancies on patients. Reported discrepancy rates between EP and radiologist readings have ranged from 1% to 28% [1-13]. However, clinically significant discrepancies that lead to a change in patient management are relatively uncommon; reported rates in these studies ranged from 0% to 9%. Studies in adult EDs have generally reported lower imaging discrepancy rates than those found in pediatric EDs, ranging from 0.8% to 3.7% [5, 12]. Rates of clinically significant discrepancies in adult studies are also lower, ranging from 0% to 2.8% [1, 11].
The purpose of this study was to examine discrepancies in radiograph interpretation between EPs and radiologists and to determine whether they were clinically significant. Changes in anatomy and ossification centres with age make pediatric radiographs more challenging to interpret. Earlier studies have either combined adult and pediatric patients, thereby limiting their study to imaging of fewer body parts, or were conducted when there were few trained pediatric EPs.
METHODS
The Hamilton Integrated Research Ethics Board granted approval for this study (REB #14-667-C). It consisted of a retrospective 18-month review of discrepant radiology reports from McMaster Children’s Hospital, a tertiary-care academic pediatric ED, from October 2012 to December 2014. The department has a census of more than 40,000 patient visits a year. The current practice is for all radiographs to be interpreted first by the staff EP and then reviewed by a staff radiologist within 24 hours. If this final impression differs from the EP’s interpretation, the report is flagged as a “discrepancy” according to ED protocol. These films are reviewed as part of a QA process where the EP on duty is given a list of discrepant reports, reconciles the findings with the original patient chart, and provides the family with a new treatment plan, if indicated. For the purposes of comparison, we took the staff radiologist’s report as the gold standard.
This chart review followed the Gilbert criteria [14]. Inclusion criteria consisted of all plain X-rays completed during this period and classified as discrepant. A process was created to save and index all such imaging results for further review. Other imaging modalities (computed tomography, magnetic resonance imaging, ultrasound) and trauma imaging were excluded. If the EP had a question about the interpretation of the radiograph in real time and the radiologist provided a preliminary report, the case was also excluded. Radiographs were categorized as chest, abdomen, axial skeleton, upper extremity, lower extremity, soft-tissue neck, and other. On the basis of the radiologist’s final interpretation, discrepancies were classified as false positive (FP, abnormality noted by the EP but deemed normal by the radiologist), false negative (FN, abnormality missed by the EP), or not a discrepancy (abnormality noted by the EP elsewhere in the patient chart).
All charts were reviewed and data abstracted by one investigator (JT). If the correct abnormality was identified in a false-negative report, the case was marked as a correct diagnosis. Clinically significant errors that required a change in management of the patient were identified on the basis of medical records. Clinically significant was defined as a discrepancy requiring a change in patient management, including a new prescription, a return to the ED, or follow-up in a specialized clinic.
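The classification rules described in the Methods can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names, the action labels, and the choice to reject concordant reports are hypothetical and are not taken from the study's abstraction form.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    FALSE_POSITIVE = "FP"
    FALSE_NEGATIVE = "FN"
    NOT_DISCREPANT = "not a discrepancy"


# Management changes named in the Methods as making a discrepancy clinically
# significant. The string labels themselves are hypothetical.
SIGNIFICANT_ACTIONS = frozenset(
    {"new_prescription", "return_to_ed", "specialty_follow_up"}
)


@dataclass
class FlaggedReport:
    ep_noted_abnormality: bool        # EP's documented reading called the film abnormal
    radiologist_abnormal: bool        # radiologist's final report (gold standard)
    noted_elsewhere_in_chart: bool    # EP recorded the finding in the chart notes
    actions: frozenset = frozenset()  # management changes found in the medical record


def classify(report):
    """Assign a flagged report to FP, FN, or 'not a discrepancy'."""
    if report.noted_elsewhere_in_chart:
        return Category.NOT_DISCREPANT
    if report.ep_noted_abnormality and not report.radiologist_abnormal:
        return Category.FALSE_POSITIVE
    if report.radiologist_abnormal and not report.ep_noted_abnormality:
        return Category.FALSE_NEGATIVE
    # Assumption: a flagged report should never show the two readers agreeing.
    raise ValueError("readings agree; report should not have been flagged")


def is_clinically_significant(report):
    """True when a true discrepancy led to a listed change in management."""
    if classify(report) is Category.NOT_DISCREPANT:
        return False
    return bool(report.actions & SIGNIFICANT_ACTIONS)


missed_fracture = FlaggedReport(False, True, False, frozenset({"return_to_ed"}))
print(classify(missed_fracture).value, is_clinically_significant(missed_fracture))
```

Keeping the "noted elsewhere in the chart" check first mirrors the way such cases were set aside before any FP or FN label was applied.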
A total of 77 discrepancies and associated charts (26%) were randomly selected and reviewed by a second investigator (SS) to determine inter-rater reliability. Consistency between reviewers was assessed by the Kappa statistic. Data were collected using a standardized abstraction form and documented in Microsoft Excel.
RESULTS
A total of 25,304 plain radiographs were completed during this period. They included 7,939 chest (CXR), 2,914 abdomen (AXR), 6,407 upper extremity (UE), 4,396 lower extremity (LE), 2,336 axial skeleton (AS), 407 soft-tissue neck (STN), and 915 other. Of the 293 discrepancies recorded, 40 proved to be not discrepant. In these cases, the EP’s interpretation was not documented in the radiology view box, but the chart notes showed that the radiograph had been interpreted and managed appropriately. They were therefore considered non-discrepant. There were 252 (1.00%) true discrepancies, involving 123 female and 129 male patients. The average age was 7.4 years (range 3 days to 17.6 years, SD 5.4 years). Table 1 summarizes patient demographics by body region imaged. Of the true discrepancies, 207 were false-negative (82.1%) and 45 were false-positive (17.9%) interpretations. Table 2 shows a breakdown of the type of X-ray and the category of discrepancy.
SD=standard deviation; yr=year.
FN=false negative; FP=false positive.
The clinically significant error rate for all radiographs completed during the study period was 0.46% (116/25,304). Clinically significant changes included returning patients to the ED (51 patients), filling prescriptions (25), and calling patient or family members to arrange follow-up appointments (38). One patient’s follow-up was not documented. Of the follow-ups done, none resulted in permanent morbidity or any mortality. Table 3 shows the overall discrepancy rate by type of radiograph and clinical significance. CXR was the most frequent study ordered, comprising 31.3% of the total ordered. It also had the highest error rate (0.17%), followed by upper extremity X-rays. A discrepancy occurring in an X-ray of the lower limbs had the highest rate of clinical significance.
The overall discrepancy rate is the number of cases for this body part divided by the total number of discrepant cases. Overall clinical significance is the rate of clinically significant cases divided by the total number of radiographs completed during the study period.
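As a quick arithmetic check, the headline percentages can be recomputed from the counts reported in the Results. The sketch below uses only those reported totals; the variable and function names are illustrative, and per-body-part counts are omitted because they appear only in the tables.

```python
# Counts as reported in the Results section.
TOTAL_RADIOGRAPHS = 25_304
TRUE_DISCREPANCIES = 252
FALSE_NEGATIVES = 207
FALSE_POSITIVES = 45
CLINICALLY_SIGNIFICANT = 116


def percent(numerator, denominator):
    """Return numerator / denominator expressed as a percentage."""
    return 100 * numerator / denominator


discrepancy_rate = percent(TRUE_DISCREPANCIES, TOTAL_RADIOGRAPHS)
significant_rate = percent(CLINICALLY_SIGNIFICANT, TOTAL_RADIOGRAPHS)
fn_share = percent(FALSE_NEGATIVES, TRUE_DISCREPANCIES)
fp_share = percent(FALSE_POSITIVES, TRUE_DISCREPANCIES)

print(f"True discrepancy rate:        {discrepancy_rate:.2f}%")   # 1.00%
print(f"Clinically significant rate:  {significant_rate:.2f}%")   # 0.46%
print(f"False negatives (of 252):     {fn_share:.1f}%")           # 82.1%
print(f"False positives (of 252):     {fp_share:.1f}%")           # 17.9%
```

Note that the two rates in the first pair use all radiographs as the denominator, whereas the FN and FP shares use the 252 true discrepancies.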
Seventy-seven charts were randomly selected and reviewed for accuracy and consistency. The calculated inter-rater reliability in the charts selected for review was found to be Kappa 0.89 (p<0.01, 95% CI 0.79-0.99). For the abstraction of clinical significance, inter-rater reliability was found to be Kappa 0.76 (p<0.01, 95% CI 0.63-0.89). As described by Altman’s qualitative classification system, these Kappa values represent good to very good inter-rater agreement [15].
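For readers unfamiliar with the statistic, the following is a minimal sketch of how Cohen's Kappa is computed from a two-reviewer agreement table and mapped to Altman's qualitative bands. The 2x2 counts are hypothetical (the study does not report its agreement tables); they are chosen only to total 77 charts, and the function names are illustrative.

```python
def cohens_kappa(table):
    """Cohen's Kappa for a square agreement table.

    table[i][j] = number of charts that reviewer 1 placed in category i
    and reviewer 2 placed in category j.
    """
    n = sum(sum(row) for row in table)
    size = len(table)
    observed = sum(table[i][i] for i in range(size)) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(size)) for j in range(size)]
    # Agreement expected by chance, from each reviewer's marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    return (observed - expected) / (1 - expected)


def altman_band(kappa):
    """Altman's qualitative strength-of-agreement label for a Kappa value."""
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "good"
    return "very good"


# HYPOTHETICAL counts for 77 double-reviewed charts (not the study's data).
hypothetical_table = [
    [40, 3],   # reviewer 1: category A
    [2, 32],   # reviewer 1: category B
]
kappa = cohens_kappa(hypothetical_table)
print(f"kappa = {kappa:.2f} ({altman_band(kappa)})")

# The reported values fall in Altman's top two bands.
print(altman_band(0.89), "/", altman_band(0.76))
```

Under these bands, the reported values of 0.89 and 0.76 map to "very good" and "good" respectively, which matches the interpretation given in the text.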
DISCUSSION
This retrospective study shows that the rates of discrepancy between EPs’ and radiologists’ interpretations of radiographs in a pediatric ED are quite low. The number and proportion of clinically significant discrepancies are even lower, affecting patient management in only 0.46% of all radiographs ordered. Prior studies, summarized in Table 4, show that EPs do well in interpreting adult X-ray images. However, there has been considerable variability in the interpretation of pediatric X-rays, due perhaps to anatomic changes that vary with age as well as a lack of expertise in interpreting these images. As pediatric emergency medicine has evolved into a recognized subspecialty, the ability of EPs to interpret X-ray images has improved. Higginson et al. (2004) ask whether, given the low error rate, we still need radiologists to interpret X-rays of pediatric patients in the ED [2].
* Included only CXR images.
† Looked at minor trauma patients only. The study is based on the number of patients, not radiographs.
‡ This study looked at extremities only.
Several factors can explain the low error rates found in our study. In contrast to most earlier studies, all radiographs were interpreted and documented by the staff EP, not by trainees or house staff. Our results are therefore not surprising, given that most of the staff reporting these images are either fellowship trained or have significant experience in pediatric emergency medicine. Another possible factor is the low rate of false positives as a proportion of total discrepancies (45 of the 293 flagged reports, 15.4%). Radiologists may be less inclined to flag a discrepancy if they consider it a clinically unimportant “overcall” that will not affect patient management, such as an EP’s decision to treat a patient with antibiotics for presumed pneumonia. EPs are likely to prescribe antibiotics if there is clinical concern or suspicion, even in the absence of definitive findings of pneumonia on imaging. Earlier studies of inter-observer reliability among staff radiologists in diagnosing pneumonia on the basis of chest radiographs found a wide range of agreement, with Kappa values between 0.54 and 0.92 [16-20]. This variation in interpretation suggests that with the added benefit of clinical findings, EPs may be more likely to make an accurate diagnosis.
This study was limited to a single-site tertiary care pediatric centre, and all of the cases were flagged and placed in the discrepancy folder by the interpreting EP. Although not every discrepant image and report may have been flagged, reminders were emailed to all staff, and an information sheet was placed in the radiology viewer system. Finally, no discrepancy was found in some cases initially flagged as discrepant. These cases arose when EPs failed to place their readings in the view box, which points to a systems issue rather than a true discrepancy. Further studies are needed to assess the validity of our results as well as their generalizability to EPs who work in the community or are not pediatric emergency fellowship trained.
CONCLUSION
In summary, the accuracy of ED staff physicians in interpreting radiographs is high, and the frequency of errors requiring a change in patient management is very low. The majority of errors occurred with radiographs of the chest and upper extremities. The low rate of clinically significant discrepancy allows safe management based on EP interpretation.
Acknowledgements: The abstract for this paper was presented at the European Congress on Emergency Medicine, European Society of Emergency Medicine, Torino, Italy, October 2015. It won the Best Young Scientist Award. JT and RV conceived the study, designed the trial, and obtained research ethics approval. JT, SS, and RV supervised the conduct of the trial and data collection. JT, SS, and RV undertook the recruitment of patients and managed the data, including quality control. JT, SS, and RV provided statistical advice on study design and analysed the data. JT drafted the manuscript, and all authors contributed substantially to its revision. RV takes responsibility for the paper as a whole.
Competing interests: None declared.