Introduction
Spatial neglect is a common result of unilateral, predominantly right-hemispheric brain damage (Becker & Karnath, 2007). Its core symptoms include an egocentric bias in gaze direction and exploration towards the ipsilesional side (Corbetta & Shulman, 2011; Karnath & Rorden, 2012). Cancelation tasks are one type of test used to detect and quantify these symptoms (Weintraub & Mesulam, 1985; Gauthier, Dehaut & Joanette, 1989; Ferber & Karnath, 2001). They are commonly presented on sheets of paper placed in front of the patient, who is required to find and manually mark all targets among distractors. Patients with spatial neglect often miss targets on the contralesional side. The presence and severity of spatial neglect can be measured by the Center of Cancelation (CoC; Rorden & Karnath, 2010), which assesses the average position of correctly marked targets with respect to the patient’s ego center.
While paper-and-pencil-based cancelation tasks can be a time-efficient yet reliable diagnostic alternative to more extensive test batteries (Fullerton, Stout, & McSherry, 1986; Ferber & Karnath, 2001), they provide only part of the information that computer-based versions could offer (Schendel & Robertson, 2002; Bonato & Deouell, 2013; Dalmaijer, Van der Stigchel, Nijboer, Cornelissen, & Husain, 2015). Among other aspects, digitization can provide additional variables such as response time, revisits (of already marked items), and information concerning the search path applied (Donnelly et al., 1999; Dalmaijer et al., 2015).
However, due to the lack of comparison with traditional, validated paper-and-pencil versions, it cannot yet be excluded that variations in test format lead to results that differ from those of the traditional versions. Furthermore, in clinical practice, traditional A4 paper-and-pencil tests will likely be implemented as scaled-down versions matching commonly used tablet sizes. The effect of using devices of different sizes on the validity of the tests, however, has not yet been sufficiently studied in the context of cancelation tasks. Concerning line bisection, another means used to diagnose neglect, previous observations have suggested that the length of the bisected line may have some influence on spatial attentional processing (Bowers & Heilman, 1980; McCourt & Jewell, 1999; Anderson, 1997). On the other hand, studies in neurological patients have suggested that a change in frame size, that is, the size of the space searched by the patient, does not necessarily affect neglect-specific impairments. Body-centered (egocentric) and object-centered (allocentric) neglect appeared to adapt dynamically to different frame sizes (Karnath & Niemeier, 2002; Baylis, Baylis, & Gore, 2004; Karnath, Mandler & Clavagnier, 2011; Li, Karnath & Rorden, 2014).
In the present study, we compared right hemisphere stroke patients’ performance in cancelation tasks across different formats (paper-and-pencil vs. digital) and display sizes (small, medium, large) to investigate whether digitization of traditional cancelation tasks to various screen sizes affects their validity. As new variables become available through digitization, a further objective was to evaluate their contribution to diagnostic decisions. This is important because in clinical practice patients cannot always complete a cancelation task (e.g., because they are too exhausted or because testing must be interrupted due to other clinical necessities). While measuring the CoC requires running the test to completion, other behavioral variables might be extractable early on and thus aid diagnosis if a test cannot be completed, provided that these parameters prove to detect neglect-specific behavior. Based on previous observations on neglect patients’ visual coordination (Karnath & Huber, 1992; Donnelly et al., 1999; Ptak, Golay, Müri, & Schnider, 2009; Machner et al., 2012; Kaufmann et al., 2020), we investigated the parameters search speed (number of targets found relative to time), search distance (the mean distance between two consecutive targets), and search strategy (a calculation of search path) for their ability to predict spatial neglect.
Methodological investigations have shown that effects revealed by statistical analyses often have limited informative value in (applied) diagnostic contexts, even when effect sizes are very large (Dwyer, Falkai & Koutsouleris, 2018). Due to their strong focus on generalization and prediction of unknown data, machine learning approaches appear to be more suitable in most diagnostic applications than most statistical modeling approaches (Dwyer, Falkai & Koutsouleris, 2018). The specific use of machine learning models in diagnostic processes can vary, ranging from automatic evaluations of diagnostic tasks (Chen et al., 2020) to interpretable classifications that outperform traditional paper-based tests in the prediction of neuropsychiatric disorders (Souillard-Mandar et al., 2021). Accordingly, in the present investigation, we tested the potential diagnostic value of process parameters obtained from digital cancelation tests using such approaches.
Method
Subjects
Nineteen consecutively admitted acute right hemisphere stroke patients (N = 8 without spatial neglect; N = 11 suffering from spatial neglect) and one chronic neglect patient who returned for a follow-up neuropsychological investigation were recruited at the Centre of Neurology at Tuebingen University. Structural imaging was acquired by computed tomography as part of the clinical routine conducted for all stroke patients at admission, except for one patient who received magnetic resonance imaging instead. Patients with diffuse or bilateral brain lesions, patients with tumors, and patients without obvious lesions were not included. According to the routine clinical neurological examination, patients did not suffer from any further neurological pathologies. Clinical and demographic variables of the two patient groups are summarized in Table 1; Figure 1 illustrates an overlap plot of their brain lesions. The study was performed in accordance with the revised Declaration of Helsinki and was approved by the local ethics committee; all patients provided their written consent to participate.
Table 1 notes: * derived from initial diagnostics; ** pooled from digital versions (data were not evident from paper-and-pencil versions).
All patients were clinically examined with a bedside neglect screening upon admission to the Centre of Neurology. This screening determined patients’ allocation to the neglect group or the control group. The 19 acute stroke patients were tested on average 6.4 days (SD = 4.5) post-stroke; the chronic neglect patient was tested 32 months post-stroke. The screening included two cancelation tasks (Bells test [Gauthier et al., 1989]; Letters test [Weintraub & Mesulam, 1985]) and a copying task (Johannsen & Karnath, 2004). Each task was presented on a sheet of DIN A4 paper (297 × 210 mm). We calculated the CoC using the procedure and cut-off scores for neglect diagnosis by Rorden and Karnath (2010) for both the Letters test (cut-off: ±0.083) and the Bells test (cut-off: ±0.081). The CoC is a sensitive measure capturing both the number and the location of omissions, with zero representing an equal distribution of correctly identified stimuli along the x-axis of the test sheet. Negative deviations (with a maximum of −1) indicate a bias to the left side of the test sheet. Positive deviations (with a maximum of 1) indicate a bias to the right side of the test sheet. The copying task requires patients to copy a complex multi-object scene consisting of four figures (a fence, a car, a house, and a tree), with two of them located in each half of the horizontally oriented sheet of paper. Omission of at least one of the contralesional features of each figure was scored as 1, and omission of each whole figure was scored as 2. One additional point was given when contralesionally located figures were drawn on the ipsilesional side of the paper sheet. The maximum score was 8. A score higher than 1 (i.e., > 12.5% omissions) was taken to indicate neglect.
The duration of each test depended on the patient being satisfied with his/her performance and confirming this twice. Spatial neglect was diagnosed if patients scored within the pathological ranges of at least 2 out of 3 tests (see Table 1).
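The idea behind the CoC can be illustrated with a short sketch. It is a simplification and not the published procedure: it assumes that horizontal target positions are already expressed on a scale from −1 (left edge) to 1 (right edge), and it omits any additional scaling that the procedure of Rorden and Karnath (2010) may apply. The function name and the example layout are hypothetical.

```python
from statistics import mean

def center_of_cancelation(target_xs, cancelled):
    """Simplified CoC: mean x of cancelled targets relative to mean x of all targets.

    target_xs: horizontal target positions, assumed to be scaled so that
               -1 is the left edge and +1 the right edge of the sheet.
    cancelled: booleans, True where the corresponding target was marked.
    """
    found = [x for x, hit in zip(target_xs, cancelled) if hit]
    if not found:
        raise ValueError("no targets were cancelled")
    return mean(found) - mean(target_xs)

# Hypothetical sheet with six targets spread symmetrically around the midline.
xs = [-0.9, -0.5, -0.1, 0.1, 0.5, 0.9]
all_found = center_of_cancelation(xs, [True] * 6)
right_only = center_of_cancelation(xs, [False, False, False, True, True, True])
```

In this toy layout, marking every target gives a value of zero, whereas marking only the right-hand targets gives a clearly positive value that would exceed the cut-offs quoted above.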
Material and procedure
The experiment included the same cancelation tasks as the clinical assessment, that is, the Bells and Letters tests, presented on A4 sheets of paper. In addition, the experiment comprised computerized touchscreen versions of these cancelation tasks. Computerized testing was performed on a capacitive 27-inch multi-touch display (3M M2767PW) connected to a laptop (HP ProBook 4740s with Windows 7 Professional). The touchscreen versions of the two tasks were custom created using MATLAB 2016b and Psychtoolbox (https://doi.org/10.17632/6dzxs69j7d.1). The computerized cancelation tests (touch screen, TS) were high-resolution versions of the original test images used for the paper-and-pencil version, displayed in three different sizes: 260.28 mm × 173.52 mm (“TS small”; a tablet size as found, e.g., in the Microsoft Surface, HP Elite, or Dell Latitude 5290), 297 mm × 210 mm (“TS medium”; equivalent to A4 paper), and 597.6 mm × 336.2 mm (“TS large”; full-screen size of the 27-inch touch screen). The small and medium versions were displayed centrally on the 27-inch display, with a black margin between the edge of the test and the edge of the screen. Regardless of the size used in a given condition, test coordinates were always expressed as relative distances from the center to the borders, ranging from −1 to 1, with −1/−1 representing the upper left corner. To keep paper-and-pencil and touchscreen conditions as comparable as possible, the touchscreen lay flat on the table and a touchscreen-compatible pen (Adonit Dash 2) was used to mark the targets. Patients’ marks were visualized in real time, providing patients with visual feedback comparable to that provided by conventional pens on a regular sheet of paper. Due to their health issues, four patients were unable to complete all trials, which led to nine missing data sets in different test conditions. These patients had to be excluded from parts of the analyses.
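The size-independent coordinate system described above can be sketched as follows. This is an illustration only: the function name is hypothetical, and it assumes that raw positions are given in millimeters with the origin at the upper left corner of the test image, which the text does not specify.

```python
def to_relative(x_mm, y_mm, width_mm, height_mm):
    """Map a position in mm (origin at the top-left of the test image) to [-1, 1]."""
    rel_x = 2.0 * x_mm / width_mm - 1.0
    rel_y = 2.0 * y_mm / height_mm - 1.0
    return rel_x, rel_y

# Display sizes as reported for the three touchscreen conditions.
SIZES_MM = {
    "TS small": (260.28, 173.52),
    "TS medium": (297.0, 210.0),
    "TS large": (597.6, 336.2),
}

# The upper left corner and the center map to the same relative
# coordinates in every condition, whatever the physical size.
corners = {name: to_relative(0.0, 0.0, w, h) for name, (w, h) in SIZES_MM.items()}
centers = {name: to_relative(w / 2, h / 2, w, h) for name, (w, h) in SIZES_MM.items()}
```

Expressing positions this way is what allows measures such as the CoC or the search distance to be compared across the three screen sizes without rescaling.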
In the experiment, half of the participants started with the paper-and-pencil version of the two cancelation tasks, the other half with the touchscreen versions. The order of the two paper-and-pencil versions was alternated; the order of the six different touchscreen versions was randomized. Participants were instructed to find all the bells/“A”s that were spread among distractors and to tell the experimenter once they were done. Before starting the next trial, patients confirmed that they were indeed done with the current trial, that is, that they could not find any other target stimuli.
Data analysis
To compare right hemisphere stroke patients’ CoC performance in cancelation tasks across different formats (paper-and-pencil vs. digital) and display sizes (small, medium, large), we used Wilcoxon and Friedman tests, respectively. To measure (1) search speed, we divided a participant’s total number of correctly identified items by the time (in seconds) between starting the test and marking the last item, yielding the number of targets found per second. For (2) search distance, we averaged the Euclidean distance between every two targets found in direct succession. In contrast, a high degree of (3) search strategy was defined by a pattern that keeps the steps along either the (assumed) x- or y-axis small and thus results in a row-wise (a low distance on the y-axis) or column-wise (a low distance on the x-axis) search (for an illustration of the distinction see Figure 2). Both axis distances were averaged across all found targets. A strategic search, as we define it here, should result in low values in either the mean x-axis distance or the mean y-axis distance. Low y values indicate a row-wise left-to-right (reading-like; Figure 3A) or alternating left-to-right and right-to-left (Figure 3B) search pattern; low x values indicate a column-wise top-to-bottom (Figure 3C) or alternating top-to-bottom and bottom-to-top (Figure 3D) search pattern. The measure is independent of direction and also applies if tests were started from the right or the bottom. To investigate potential differences between (i) the digital screen sizes and (ii) right-hemispheric patients with spatial neglect in comparison to patients without neglect, we applied 2 × 3 analyses of variance, one for each of the three parameters above (i.e., search speed, search distance, and search strategy), with the between-subjects factor group (neglect vs. no neglect) and the within-subjects factor screen size (TS small vs. TS medium vs. TS large).
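The three process parameters can be sketched in a few lines. The sketch assumes that the search path is a list of (x, y) positions of correctly marked targets in the order in which they were found, given in relative coordinates. The text does not state how the two axis distances are combined into a single score, so taking the smaller of the two is an assumption made here for illustration; all function names and example paths are hypothetical.

```python
from math import hypot

def search_speed(n_correct, seconds_to_last_mark):
    """Targets found per second, from test start to the last marked item."""
    return n_correct / seconds_to_last_mark

def search_distance(path):
    """Mean Euclidean distance between consecutively marked targets."""
    steps = list(zip(path, path[1:]))
    return sum(hypot(x2 - x1, y2 - y1) for (x1, y1), (x2, y2) in steps) / len(steps)

def axis_distances(path):
    """Mean absolute step along x and along y between consecutive targets."""
    steps = list(zip(path, path[1:]))
    mean_dx = sum(abs(x2 - x1) for (x1, _), (x2, _) in steps) / len(steps)
    mean_dy = sum(abs(y2 - y1) for (_, y1), (_, y2) in steps) / len(steps)
    return mean_dx, mean_dy

def search_strategy(path):
    """Assumed summary: the smaller mean axis distance (low = more strategic)."""
    return min(axis_distances(path))

# Hypothetical row-wise path: left to right on the top row, then back on the bottom row.
row_wise = [(-0.5, -0.5), (0.0, -0.5), (0.5, -0.5), (0.5, 0.5), (0.0, 0.5), (-0.5, 0.5)]
# The same targets visited in a jumpy order.
erratic = [(-0.5, -0.5), (0.5, 0.5), (0.0, -0.5), (-0.5, 0.5), (0.5, -0.5), (0.0, 0.5)]
```

For the row-wise path the mean y-axis step is small because only one step changes rows, so the score is low; the erratic path has large steps on both axes and therefore scores higher, in line with the definition that low values indicate a strategic search.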
Finally, to analyze whether the three parameters above can be used to reliably predict participants’ neglect diagnosis (dichotomized: spatial neglect vs. no neglect), we used Support Vector Machines (SVM). Given that SVM require complete data sets, we first used Multiple Imputation by Chained Equations (MICE; White, Royston, & Wood, 2011) to impute missing data for this analysis step only. Missing values in a given column were estimated using a Bayesian Ridge Regressor, predicting values of the current column from all other columns. MICE was carried out column-wise, from the column with the fewest missing values to the column with the most missing values. The potential impact of the imputation was tested by rerunning all analyses with a dataset in which missing values were omitted. In the following sections, only results for the imputed dataset will be reported, because the pattern of results remained identical with and without imputation. Due to the sample size of the present study, we decided to use a dataset containing all screen sizes (TS small, TS medium, TS large) for each participant. To account for the dependence of data points in this approach (i.e., three measures for each participant), we tested our models through Leave-One-Subject-Out Cross-Validation. In this procedure, the machine learning model is trained on the data of all participants but one and tested on the participant that was left out of training. This process is repeated until each participant has been predicted once, and prediction outcomes (i.e., balanced accuracy due to the unequal group sizes; Brodersen, Ong, Stephan & Buhmann, 2010) are averaged across predictions for all participants.
Hyperparameters (i.e., the kernel: linear or radial basis function; cost parameter: ranging from 0.01 to 10) were optimized through a grid search in a nested Leave-One-Subject-Out Cross-Validation (within the training dataset). This procedure was carried out separately for each task (Bells and Letters test), and balanced accuracy scores were obtained across all screen sizes for both tasks. Lastly, the percentages of correctly classified neglect and right-hemispheric control patients were accumulated for each screen size and test. To test whether the classification accuracy varied by screen size, chi-square tests of independence were used to compare the distribution of correctly classified neglect and right-hemispheric control patients across screen sizes for each task (Bells and Letters test). All machine learning analyses were conducted in Python using the scikit-learn module (Pedregosa et al., 2011).
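The logic of Leave-One-Subject-Out Cross-Validation and of balanced accuracy can be sketched without external libraries. This is not the analysis code of the study: the classifier below is a simple nearest-centroid rule used as a stand-in for the SVM, the nested grid search and the imputation are omitted, predictions are pooled before scoring, and all data are invented. In scikit-learn, the corresponding building blocks would be `LeaveOneGroupOut`, `GridSearchCV`, and `SVC`.

```python
from collections import defaultdict

def loso_splits(subject_ids):
    """Yield (train_idx, test_idx) with all rows of one subject held out per fold."""
    rows = defaultdict(list)
    for i, sid in enumerate(subject_ids):
        rows[sid].append(i)
    for sid, test_idx in rows.items():
        train_idx = [i for i in range(len(subject_ids)) if subject_ids[i] != sid]
        yield train_idx, test_idx

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, which corrects for unequal group sizes."""
    recalls = []
    for label in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == label]
        recalls.append(sum(y_pred[i] == label for i in idx) / len(idx))
    return sum(recalls) / len(recalls)

def nearest_centroid_predict(X_train, y_train, x):
    """Stand-in classifier (NOT an SVM): assign the class with the closest mean."""
    best_label, best_dist = None, float("inf")
    for label in set(y_train):
        pts = [p for p, y in zip(X_train, y_train) if y == label]
        centroid = [sum(col) / len(pts) for col in zip(*pts)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

# Hypothetical data: 4 subjects x 3 screen sizes, features = (speed, distance).
subjects = ["s1"] * 3 + ["s2"] * 3 + ["s3"] * 3 + ["s4"] * 3
X = [(0.20, 0.50), (0.21, 0.52), (0.19, 0.49),
     (0.22, 0.48), (0.20, 0.51), (0.23, 0.47),
     (0.10, 0.70), (0.12, 0.68), (0.11, 0.72),
     (0.09, 0.71), (0.13, 0.66), (0.10, 0.69)]
y = [0] * 6 + [1] * 6  # 0 = no neglect, 1 = neglect

y_pred = [None] * len(y)
for train_idx, test_idx in loso_splits(subjects):
    X_tr = [X[i] for i in train_idx]
    y_tr = [y[i] for i in train_idx]
    for i in test_idx:
        y_pred[i] = nearest_centroid_predict(X_tr, y_tr, X[i])

score = balanced_accuracy(y, y_pred)
```

Holding out all three measurements of a participant at once is the essential point: a model never sees data from the person it is tested on, so the dependence between a participant's screen-size conditions cannot inflate the estimate.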
Results
Comparison between paper-and-pencil and digital formats
To investigate whether digital versus paper-and-pencil test format has an impact on patients’ performance in cancelation tasks, we compared patients’ CoC scores in the A4 paper-and-pencil version to those in the digital touch screen version of the same size (TS medium). Data are illustrated in Figure 4. Wilcoxon tests indicated no significant median CoC differences between the digital and the paper-and-pencil versions, neither in the Letters test (Z = 1.784, p = 0.072) nor in the Bells test (Z = 0.533, p = 0.594). In clinical practice, the traditional A4 paper-and-pencil tests will most likely be implemented as a downscaled version to match the currently used tablet size. Thus, we also investigated (cf. Figure 4) whether differences in performance arise between the established A4 paper-and-pencil version and the digital downsized tablet version (TS small). Again, we did not find significant differences for either the Letters test (Z = 1.784, p = 0.074) or the Bells test (Z = −0.356, p = 0.722).
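For readers unfamiliar with the Z values reported above, the following sketch shows one common way to obtain a Z statistic from a Wilcoxon signed-rank test using the normal approximation. It is a minimal illustration, not the software used in the study: it applies no tie or continuity correction to the variance, so statistical packages may return slightly different values. The example scores are invented and given as integers (CoC × 100) to avoid floating-point ties.

```python
from math import sqrt

def wilcoxon_z(a, b):
    """Normal-approximation Z for the Wilcoxon signed-rank test on paired samples.

    Zero differences are dropped and tied absolute differences share their mean
    rank; no tie or continuity correction is applied to the variance.
    """
    diffs = [x - y for x, y in zip(a, b) if x != y]
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # mean rank of the tied block
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean_w = n * (n + 1) / 4
    sd_w = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (w_plus - mean_w) / sd_w

# Hypothetical paired scores (CoC x 100) for eight patients.
paper = [2, 10, 45, 1, 70, 5, 30, 12]
digital = [3, 8, 50, 1, 66, 9, 24, 19]
z = wilcoxon_z(paper, digital)
```

A Z value close to zero, as in this invented example, means that positive and negative differences between the two formats roughly balance out.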
Comparison between different sizes of the digital format
Center of cancelation
To investigate whether size variation between the digital versions affects cancelation performance, we used the CoC as dependent variable and test size (TS small vs. TS medium vs. TS large) as independent variable. Data are illustrated in Figure 5. Friedman tests revealed no significant effect for the Letters test (χ²_F(2) = 1.750, p = 0.417). The result for the Bells test (χ²_F(2) = 6.00, p = 0.050) was at the threshold of significance. We therefore applied post hoc Wilcoxon comparisons to rule out significant differences. Indeed, all three comparisons were non-significant.
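The Friedman statistic and its p-value for three related conditions can be sketched as follows. The sketch is illustrative only: it applies no tie correction, the closed-form p-value holds only for two degrees of freedom (i.e., three conditions), and the example rows are invented.

```python
from math import exp

def friedman_chi2(rows):
    """Friedman statistic for per-subject rows with k related conditions.

    Ranks are assigned within each row; ties receive their mean rank. No tie
    correction is applied to the statistic.
    """
    n, k = len(rows), len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:
        order = sorted(range(k), key=lambda j: row[j])
        i = 0
        while i < k:
            j = i
            while j + 1 < k and row[order[j + 1]] == row[order[i]]:
                j += 1
            for m in range(i, j + 1):
                rank_sums[order[m]] += (i + j) / 2 + 1
            i = j + 1
    return 12 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3 * n * (k + 1)

def chi2_p_df2(stat):
    """Upper-tail p-value of a chi-square statistic with exactly 2 degrees of freedom."""
    return exp(-stat / 2)

# Hypothetical scores of four patients in TS small, TS medium, TS large.
rows = [[1, 2, 3], [1, 3, 2], [2, 1, 3], [1, 2, 3]]
stat = friedman_chi2(rows)
p_value = chi2_p_df2(stat)
```

Applied to the statistics reported above, `chi2_p_df2(1.750)` gives approximately 0.417 and `chi2_p_df2(6.00)` approximately 0.050, which matches the reported p-values.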
Additional parameters of search behavior
Beyond the CoC, the additional variables search speed, search distance, and search strategy were obtained from the digitized cancelation tasks.
Search speed
Data are illustrated in Figure 6. Analysis of the Bells test revealed a significant main effect of group (F(1,15) = 6.719, p = 0.02, ηp² = 0.309), indicating that control patients found significantly more targets per second (M = 0.196, SD = 0.057) than neglect patients (M = 0.124, SD = 0.066). The main effect of screen size, on the other hand, was not significant (F(2,30) = 0.235, p = 0.792), indicating that a comparable number of targets was found per second in all three screen sizes. The interaction was not significant either (F(2,30) = 1.983, p = 0.155). The same analysis applied to the Letters test also revealed a significant main effect of group (F(1,13) = 8.624, p = 0.012, ηp² = 0.399), indicating again that control patients on average found more targets per second (M = 0.278, SD = 0.090) than neglect patients (M = 0.140, SD = 0.091). The main effect of screen size was significant as well (F(2,26) = 5.219, p = 0.012); the interaction was not significant (F(2,26) = 0.176, p = 0.839). According to post hoc comparisons (Fisher’s Least Significant Difference), more targets per second were found in condition TS large (M = 0.235, SD = 0.120) than in condition TS small (M = 0.183, SD = 0.066, p < 0.05, d = 0.537).
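The partial eta squared values reported with the group effects follow directly from the F statistic and its degrees of freedom. The helper below is a small sketch of that standard relationship; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# Group effects on search speed as reported for the two tests.
bells_group = partial_eta_squared(6.719, 1, 15)
letters_group = partial_eta_squared(8.624, 1, 13)
```

Rounded to three decimals, the two calls reproduce the reported effect sizes of 0.309 and 0.399.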
Search distance
Data are illustrated in Figure 6. Analysis of the Bells test revealed neither a main effect of screen size (F(2,30) = 0.250, p = 0.781) nor an interaction (F(2,30) = 3.109, p = 0.059), but a main effect of group (F(1,15) = 7.357, p = 0.016, ηp² = 0.329). Search distance was smaller for control patients (M = 0.550, SD = 0.093) than for neglect patients (M = 0.674, SD = 0.093). For the Letters test, there was both a main effect of group (F(1,13) = 5.486, p = 0.036, ηp² = 0.292) and of screen size (F(2,26) = 5.528, p = 0.010). The interaction was not significant (F(2,26) = 0.244, p = 0.785). Again, search distance was smaller for control patients (M = 0.452, SD = 0.116) than for neglect patients (M = 0.594, SD = 0.115). Post hoc comparisons indicated that in the TS large version (M = 0.474, SD = 0.120) items found in direct succession were on average closer to one another than in the TS small (M = 0.552, SD = 0.143, p = 0.026, d = 0.590) and TS medium (M = 0.543, SD = 0.127, p = 0.018, d = 0.557) versions.
Search strategy
Data are illustrated in Figure 6. Analysis of this parameter for the Bells test showed neither a main effect of screen size (F(2,30) = 2.503, p = 0.099) nor an interaction (F(2,30) = 0.002, p = 0.998), while the main effect of group was significant (F(1,22) = 9.11, p < 0.01, ηp² = 0.502). Control patients scored significantly higher (M = 0.472, SD = 0.119) than neglect patients (M = 0.251, SD = 0.117), indicating that the search behavior of neglect patients was more strategic than that of control patients. Results for the Letters test revealed a main effect of group (F(1,13) = 15.787, p = 0.002, ηp² = 0.548), indicating that neglect patients’ search behavior was significantly more strategic (M = 0.180, SD = 0.51) than that of right-hemispheric control patients (M = 0.386, SD = 0.101). There was neither a main effect of screen size (F(2,26) = 0.513, p = 0.604) nor an interaction (F(2,26) = 1.177, p = 0.324).
Prediction of spatial neglect by the additional parameters of search behavior
To determine whether the three additional parameters of search behavior provided by the digital format can be used to differentiate between right-hemispheric patients with and without spatial neglect, SVM were used. First, the binary diagnosis (neglect vs. no neglect) was predicted separately for the Bells and the Letters tests across all screen sizes, using Leave-One-Subject-Out Cross-Validation. Results showed that this cross-participant classification across screen sizes was highly accurate for the Bells test and for the Letters test, with average balanced accuracy scores of 97.92% and 88.19%, respectively. The training and test accuracies for all models are shown in Figure 7. Second, chi-square tests of independence indicated that the frequency of accurately predicted neglect and right-hemispheric control patients (see Table 2) was independent of the screen size for the Bells test (χ²(2) = 0.05, p = 0.973) and the Letters test (χ²(2) = 0.18, p = 0.914). To investigate whether the machine learning models predict neglect diagnosis solely as a proxy for lesion size, we tested whether the models could accurately differentiate between participants with an above-average and a below-average lesion volume relative to the sample. Results showed low accuracy for models trained on both tests (Bells test: 64.29%; Letters test: 54.76%), indicating that predictions were made largely independently of lesion volume.
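A chi-square test of independence on a table of group by screen size can be sketched as follows. The counts are invented for illustration and are not the values of Table 2; the function name is hypothetical, and the closed-form p-value used at the end is valid only for two degrees of freedom.

```python
from math import exp

def chi2_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c count table."""
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    total = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_tot[i] * col_tot[j] / total
            stat += (obs - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

# Hypothetical counts of correctly classified patients (NOT the values of Table 2):
# rows = neglect, control; columns = TS small, TS medium, TS large.
counts = [[11, 12, 11],
          [7, 8, 8]]
stat, dof = chi2_independence(counts)
p_value = exp(-stat / 2)  # closed form, valid only because dof == 2
```

When the proportions of correctly classified patients are nearly the same in every column, as in this example, the statistic is close to zero and the p-value close to one, which is the pattern reported above.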
Discussion
Paper-and-pencil versus digital test version
Several papers have acknowledged numerous advantages of digitizing neuropsychological assessments in general (Bauer et al., 2012; Germine, Reinecke, & Chaytor, 2019) and neglect diagnostics specifically (Donnelly et al., 1999; Bonato, Priftis, Marenzi, Umiltà, & Zorzi, 2012; Bonato & Deouell, 2013). This has inspired the introduction of novel computer-based neglect assessments (Donnelly et al., 1999; Deouell, Sacher & Soroker, 2005; Bar-Haim, Kizony, Shahar, & Katz, 2006; List et al., 2008; Bonato, Priftis, Umiltà, & Zorzi, 2013; Dalmaijer et al., 2015; Villarreal et al., 2020). Digital versions have been argued to be more flexible, allowing the creation of several parallel versions of a specific task and therefore preventing learning effects from numerous repetitions of one identical version, for example, in the course of rehabilitation (Bonato & Deouell, 2013). They can further be designed to adapt immediately to patients’ individual performance (List et al., 2008).
Moreover, digital formats could further increase a test’s sensitivity by increasing the amount of information extractable from its data (Bonato & Deouell, 2013; Dalmaijer et al., 2015). However, previous studies also stressed the importance of validating digital formats (Bauer et al., 2012; Germine et al., 2019). The present paper is, to our knowledge, the first to systematically compare patients’ performance between digital and analog formats. Stroke patients’ CoC derived from cancelation tasks seems robust to test digitization. Thus, it seems safe to introduce digitized diagnostic measures (at least within the scope of the size variations investigated in the present study) and keep the existing cut-off scores, without having to fear distortions in the CoC and related diagnostic decisions.
Test/display size of the cancelation tasks
Center of cancelation
Patients’ CoCs did not seem to be affected by test size either. Our analysis revealed that neglect patients ignored a comparable ratio of contralesional target stimuli, regardless of test size. This observation corresponds to previous findings on reference frames suggesting a dynamic view of the neglected area in space, depending on the respective behavioral goal of the subject. Karnath and Niemeier (2002) argued that the brain continuously organizes and re-organizes the representation of the same physical input according to changing task requirements. The authors showed that whether or not neglect patients ignored certain spaces in a visual search task did not depend on the frame size itself, but rather on the relative location within the part of space they were asked to pay attention to. Patients were found to ignore the left half of space when asked to explore only that very segment but attended to it fully when it constituted the right half of a larger segment. Similar results were found by Baylis and colleagues (2004). The observation that removing targets once they are identified reduces patients’ attention frame and thus draws their attention further into contralesional space (Mark et al., 1988; Keller, Volkening & Garbacenkaite, 2015) further supports this notion. In conclusion, these findings indicate that the neglect-specific egocentric bias seems to be robust to variations in screen size and provide a suitable explanation for the CoC’s indifference to size changes observed in the present study. While our results based on a sample size of 12 consecutively admitted neglect patients represent an initial estimate, further evidence based on larger samples is needed to endorse our findings.
Additional parameters of search behavior
In the additional digital parameters, neglect patients showed decreased search speed, increased search distance, and a more strategic search pattern than right-hemispheric control patients without neglect in both cancelation tasks.
Search speed and search distance
Deouell and colleagues (2005) had already shown that reaction times measured in a dynamic search task appear more sensitive than a common attention battery in illustrating neglect deficits and their recovery. Our finding that processing time of search behavior is also impaired is in agreement with these observations. Neglect patients are frequently observed to start working on tasks from the right side and effortfully drag their attention towards the contralesional hemispace. An eye-tracking study on reading behavior, for example, illustrated how straining it is for a neglect patient to advance further towards the neglected left. While healthy readers find the beginning of the next text line by performing long, pointed saccades, the investigated neglect patient moved leftward gradually (Karnath & Huber, 1992). Of course, the latter is much more time-consuming. Studies investigating the visual scanning and exploration behavior of neglect patients on photographs (Ptak et al., 2009; Machner et al., 2012; Kaufmann et al., 2020) and videos (Machner et al., 2012) found that increased salience due to motion (Ptak et al., 2009) or contrasts (Machner et al., 2012) can help a patient attend to the neglected hemispace.
Kaufmann and colleagues (2020) found neglect patients’ perseverance to ipsilesional space under neutral conditions to be so distinct that it proved to be more sensitive in detecting neglect than common diagnostic measures. This exploration pattern might directly translate to our visual search tasks. Neglect patients may be more likely to move leftward inefficiently, progressing from one stimulus to the next and coming across a target every now and then rather coincidentally. This process makes them more prone to miss a target if a distractor is closer to the current fixation and attracts patients’ attention instead, resulting in a larger search distance.
Screen size-dependent performance was found in the Letters but not the Bells test, which could be caused by the different complexity of the tests. Neglect according to the Letters test is diagnosed if more than four contralesional stimuli are omitted, while a diagnosis based on the Bells test requires at least five omissions (Rorden & Karnath, 2010). Taken in isolation, this does not seem a grave difference; however, the Letters test contains 60 targets hidden among distractors, whereas the Bells test contains only 35. Relatively speaking, the cut-offs represent 6 % versus 14 % omissions, indicating that healthy individuals are more likely to omit bells than “A”s. Automatized letter recognition is known to be superior to object identification (Denckla & Rudel, 1974), which might explain why targets were identified faster in the large than in the small screen size in the Letters but not the Bells test. Since automatized reading depends on how well letters are recognizable, enlarging letters in our paradigm likely improved participants’ perception and search efficiency by facilitating target/distractor discrimination. The more effortful shape identification required for bells might not benefit as much from a larger depiction. Patients searched significantly faster only in the large compared to the small variant of the Letters test. When normalized for screen size, the search was more thorough in the large version than in the small and medium versions. Since the size difference between medium and large (59 %) is greater than that between small and medium (28 %), a 59 % magnification seems sufficient to decrease the likelihood of missing nearby targets (thus decreasing search distance), while improving search speed requires a larger increase in test size.
Search strategy
In contrast to the fairly straightforward measures for search speed and search distance, it is rather hard to come up with a universal indicator of search strategy, since a strategic search can be performed in many different ways. The measure we defined as “search strategy” in the present study should cover strategies typically applied by healthy individuals (cf. Figure 2; Warren, Moore, & Vogtle, Reference Warren, Moore and Vogtle2008). Interestingly, we found that neglect patients’ search behavior was more “strategic” (according to our definition) than that of right-hemispheric control patients in both the Bells and the Letters test. While neglect patients more frequently applied strategies such as those shown in Figure 2, right-hemispheric controls either searched in a less “strategic” manner or applied a strategy different from the ones typically applied by healthy individuals (Warren et al., Reference Warren, Moore and Vogtle2008). While this finding might seem surprising at first glance, previous research provided evidence that neglect patients do not generally exhibit impairments in search strategy (Donnelly et al., Reference Donnelly, Guest, Fairhurst, Potter, Deighton and Patel1999; Mark et al., Reference Mark, Woods, Ball, Roth and Mennenmeier2014; Ten Brink et al., Reference Ten Brink, Visser-Meily and Nijboer2018). Donnelly and colleagues (Reference Donnelly, Guest, Fairhurst, Potter, Deighton and Patel1999) generated 16 different strategic search patterns and investigated which ones were applied by healthy participants as well as by right-hemispheric patients with and without neglect. Although neglect patients tended to apply different search strategies than the majority of healthy participants and non-neglect patients, their pattern matched some of the authors’ predefined strategic search paths. More specifically, neglect patients most frequently applied a strategy that mirrored the most common strategy used by healthy controls.
The favored strategy reported by Donnelly and colleagues was equivalent to the one illustrated in Figure 2 C. While controls started from the top left corner and worked their way to the right, neglect patients started from the top right corner and proceeded leftward. In accordance with Donnelly and colleagues, our neglect patients started from the right side in 100% of cases in the Bells test and in 94% of cases in the Letters test. However, since our measure accounted for strategies starting from both sides (left or right), both directions were considered equally strategic. Our results indicate that neglect patients seem to follow a (potentially predefined) line- or row-wise search pattern, while patients without neglect rather turn to whichever items are closest overall. Neglect patients’ previously mentioned search efforts might make them more inclined to apply search strategies, potentially to compensate for their lack of overall search efficiency, which might impair their detection of targets in close proximity to the last hit.
With regard to the diagnostic use of the three process measures (i.e., search speed, search distance, and search strategy), machine learning classifications indicated that these variables can be used to differentiate neglect patients from right-hemispheric controls reliably, using parsimonious modeling approaches. Particularly for the Letters test, the overall small differences between training and test accuracy for these cross-participant classifications (see Figure 7) indicate that the models generalize well, which is crucial for potential applications of such measures. For instance, digital tests could capture these measures in real time and predict the diagnosis at an early stage of the testing procedure, which would be beneficial if, for example, the test has to be aborted. This, in turn, could serve as a basis for adaptive and time-saving diagnostic procedures that are less strenuous for patients. While overall predictions were highly accurate in both tests (97.92% for the Letters test and 88.19% for the Bells test), it is important to note that predictions for the Bells test were less accurate and showed a larger variation between training and test samples than those for the Letters test. Here, future research with larger patient samples is needed to further evaluate such differences in the diagnostic value of process measures between tests. With regard to screen size, our analyses showed that model predictions for both tests were independent of screen size. For further studies and potential practical applications, this indicates that additional measures obtained from digitized tests can be used to reliably classify neglect regardless of screen size. Nonetheless, future research with larger sample sizes is required to confirm the robustness of our models.
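The logic of comparing training and test accuracy in a cross-participant classification can be sketched as follows. This is a minimal illustration under stated assumptions, not the modeling approach used in the present study: the nearest-centroid classifier is a simple stand-in, the leave-one-participant-out scheme is assumed, and the feature values in `TOY` are arbitrary made-up numbers that do not represent any patient data or results.

```python
from statistics import mean


def fit(samples):
    """Compute one centroid (feature-wise mean) per class label."""
    labels = {label for _, label in samples}
    return {
        label: tuple(mean(col) for col in zip(*[x for x, y in samples if y == label]))
        for label in labels
    }


def predict(model, x):
    """Assign x to the class whose centroid is closest (squared distance)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(model, key=lambda label: sq_dist(model[label]))


def accuracy(model, samples):
    """Share of samples classified correctly."""
    return mean(1.0 if predict(model, x) == y else 0.0 for x, y in samples)


def leave_one_out(samples):
    """Hold out each participant once; return mean training and test accuracy."""
    train_acc, test_acc = [], []
    for i, held_out in enumerate(samples):
        train = samples[:i] + samples[i + 1:]
        model = fit(train)
        train_acc.append(accuracy(model, train))
        test_acc.append(accuracy(model, [held_out]))
    return mean(train_acc), mean(test_acc)


# Hypothetical toy values (speed, distance, strategy), one row per participant.
# Label 1 = neglect, 0 = control. Arbitrary numbers for illustration only.
TOY = [
    ((0.20, 0.90, 0.80), 1),
    ((0.30, 0.80, 0.90), 1),
    ((0.25, 0.85, 0.70), 1),
    ((0.80, 0.30, 0.40), 0),
    ((0.90, 0.20, 0.50), 0),
    ((0.70, 0.35, 0.30), 0),
]

train_mean, test_mean = leave_one_out(TOY)
gap = train_mean - test_mean  # a small gap suggests the model generalizes
print(f"train={train_mean:.2f} test={test_mean:.2f} gap={gap:.2f}")
```

Because each participant is held out in turn, the test accuracy reflects performance on individuals the model has not seen, and the gap between the two means gives a rough indication of generalization. With real data, the classes would overlap far more than in this deliberately well-separated toy set.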
Conclusion
The present results allow an optimistic outlook on the digitization of cancelation tasks. Changes in test format (paper-and-pencil vs. digital) and in screen size do not seem to bias patients’ CoC measure, which often serves for diagnostic decision-making in spatial neglect. This robustness opens the possibility to optimize some visual parameters for more efficient testing. Increasing the stimulus size in the Letters test seems to help patients identify targets more quickly, which would make diagnosis less time-consuming for the examiner and less exhausting for the patients. Machine learning methods indicated that new search parameters derived from computerized tests could help differentiate neglect and non-neglect patients. The latter is an interesting new perspective because in clinical practice it is often not possible to perform cancelation tasks to the end (which is mandatory to calculate the CoC measure). If neglect-specific performance features can be extracted from variables such as search speed, search distance, and search strategy, neglect diagnosis might become possible even if the test is discontinued. Future studies are needed to investigate this possibility.
Acknowledgements
Funding statement
This work was supported by the Deutsche Forschungsgemeinschaft (KA 1258/23-1).
Conflicts of Interest
None