
Design and showcase of a stairs-based testbed for the benchmark of exoskeleton devices: The STEPbySTEP project

Published online by Cambridge University Press:  31 March 2025

Marco Caimmi
Affiliation:
Institute of Intelligent Industrial Technologies and Systems for Advanced Manufacturing (STIIMA) of Italian National Research Council (CNR), Milan, Italy
Nicole Maugliani
Affiliation:
Institute of Intelligent Industrial Technologies and Systems for Advanced Manufacturing (STIIMA) of Italian National Research Council (CNR), Milan, Italy
Matteo Malosio
Affiliation:
Institute of Intelligent Industrial Technologies and Systems for Advanced Manufacturing (STIIMA) of Italian National Research Council (CNR), Milan, Italy
Francesco Airoldi
Affiliation:
Institute of Intelligent Industrial Technologies and Systems for Advanced Manufacturing (STIIMA) of Italian National Research Council (CNR), Milan, Italy
Tito Dinon
Affiliation:
Institute of Intelligent Industrial Technologies and Systems for Advanced Manufacturing (STIIMA) of Italian National Research Council (CNR), Milan, Italy
Diego Borro
Affiliation:
CEIT-Basque Research and Technology Alliance (BRTA) and Tecnun (University of Navarra), Donostia-San Sebastian, Spain Institute of Data Science and Artificial Intelligence (DATAI), University of Navarra, Pamplona, Spain
Martxel Eizaguirre
Affiliation:
CEIT-Basque Research and Technology Alliance (BRTA) and Tecnun (University of Navarra), Donostia-San Sebastian, Spain
Iñaki Díaz
Affiliation:
CEIT-Basque Research and Technology Alliance (BRTA) and Tecnun (University of Navarra), Donostia-San Sebastian, Spain
Sergio Ausejo
Affiliation:
CEIT-Basque Research and Technology Alliance (BRTA) and Tecnun (University of Navarra), Donostia-San Sebastian, Spain
Gabriele Puzzo
Affiliation:
Department of Psychology, University of Bologna, Bologna, Italy
Federico Fraboni
Affiliation:
Department of Psychology, University of Bologna, Bologna, Italy
Luca Pietrantoni
Affiliation:
Department of Psychology, University of Bologna, Bologna, Italy
Marco Maccarini
Affiliation:
Department of Innovative Technologies, University of Applied Science and Arts of Southern Switzerland (SUPSI), Istituto Dalle Molle di studi sull’intelligenza artificiale (IDSIA), Lugano, Switzerland
Asad Ali Shahid
Affiliation:
Department of Innovative Technologies, University of Applied Science and Arts of Southern Switzerland (SUPSI), Istituto Dalle Molle di studi sull’intelligenza artificiale (IDSIA), Lugano, Switzerland
Loris Roveda*
Affiliation:
Department of Innovative Technologies, University of Applied Science and Arts of Southern Switzerland (SUPSI), Istituto Dalle Molle di studi sull’intelligenza artificiale (IDSIA), Lugano, Switzerland Politecnico di Milano, Dipartimento di Meccanica, Milano, Italy
*
Corresponding author: Loris Roveda; Email: [email protected]

Abstract

Wearable exoskeletons hold the potential to provide valuable physical assistance across a range of tasks, with applications steadily expanding across different scenarios. However, the lack of universally accepted testbeds and standardized protocols limits the systematic benchmarking of these devices. In response, the STEPbySTEP project, funded within the Eurobench framework, proposes a modular, sensorized, reconfigurable staircase testbed designed as a novel evaluation approach within the first European benchmarking infrastructure for robotics. This testbed, to be incorporated into the Eurobench testing facility, focuses on stairs as common yet challenging obstacles in daily life that provide a unique benchmark for exoskeleton assessment.

The primary aim of STEPbySTEP is to propose a modular framework – including a specialized staircase design, tentative metrics, and testing protocols – to aid in evaluating and comparing exoskeleton performance. Here, we present the testbed and protocols developed and validated in preliminary trials using three exoskeletons: two lower-limb exoskeletons (LLEs) and one back-support exoskeleton. The results offer initial insights into the adaptability of the staircase testbed across devices, showcasing example metrics and protocols that underscore its benchmarking potential.

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial licence (http://creativecommons.org/licenses/by-nc/4.0), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original article is properly cited. The written permission of Cambridge University Press must be obtained prior to any commercial use.
Copyright
© The Author(s), 2025. Published by Cambridge University Press

1. Introduction and paper contribution

Exoskeleton systems are increasingly adopted in fields ranging from industrial applications to medical rehabilitation, where they support and enhance human physical capabilities (Mauri et al., 2019; Dalla Gasperina et al., 2021). Despite this growing use, standardized procedures for evaluating and benchmarking exoskeleton performance are still lacking. Benchmarking plays a crucial role in establishing standardized metrics for assessing device efficacy, usability, and safety across real-world conditions. Although the importance of such practices is widely recognized, benchmarking within wearable robotics remains underdeveloped (Torricelli et al., 2019; De Bock et al., 2022).

While recent studies have addressed various benchmarking methodologies (De Bock et al., 2022), there remains a critical need for instrumented testbeds capable of replicating real-world conditions and providing objective performance assessments across diverse exoskeleton designs. In response, the H2020 EUROBENCH project (Torricelli et al., 2019) aims to bridge this gap by offering standardized testing environments. Details on the project testbeds and infrastructure are available on its webpage https://github.com/eurobench.

As part of the EUROBENCH initiative, the STEPbySTEP subproject (Maugliani et al., 2020) introduces a modular, sensor-equipped, and adaptable staircase testbed. This system is intended to establish a standardized validation process for assessing exoskeleton capabilities, performance, and user experience, specifically during stair ascent and descent, which are common yet complex challenges in daily life. Stair climbing scenarios are essential in exoskeleton evaluation due to the high joint torques and power requirements they involve (Farris et al., 2012; Tricomi et al., 2023). Most studies to date have focused on non-standard, small-scale stair tests, often neglecting key factors such as stair inclination and step height, which significantly impact exoskeleton performance (Dollar and Herr, 2008; Benson et al., 2016).

The need for a versatile, adjustable staircase testbed, where inclination and step height can be modified independently of each other, has become increasingly apparent. Such a testbed would enable comprehensive analysis of exoskeleton adaptability across different real-world scenarios, providing a foundation for precise benchmarking protocols (Zhang et al., 2023). STEPbySTEP addresses this need by delivering a highly configurable testbed with integrated measurement tools, tentative metrics, and protocols for thorough performance evaluation and comparison of exoskeleton systems.

Emerging frameworks like the Exoworkathlon have underscored the importance of benchmarking within real-world environments, especially for industrial exoskeletons (Schmidt et al., 2017). These frameworks emphasize that benchmarking must extend beyond technical performance to include factors such as cognitive workload, usability, and user perception of comfort, which are crucial elements for understanding the overall impact of exoskeletons in applied contexts (Tricomi et al., 2023). Including human factors metrics, such as cognitive workload and user discomfort, allows for a holistic approach to exoskeleton evaluation, complementing technical data collected through systems like motion capture, force plates, and EMG (Lafranchi et al., 2011; Awad et al., 2017).

This paper presents a comprehensive benchmarking scheme specifically tailored to the stair climbing scenario. The scheme includes the modular staircase testbed along with associated metrics and evaluation protocols, forming a developing framework for systematic exoskeleton performance assessment in stair-related tasks. The configurability of the testbed, which accommodates variable inclinations, step heights, and sensorized support systems, facilitates the detailed data collection that is essential for benchmarking (De Bock et al., 2022).

In the following sections, we provide an in-depth description of the designed and developed testbed and demonstrate its efficacy through case studies with two lower-limb exoskeletons (LLEs) and one back-support exoskeleton. These studies showcase the benchmarking capabilities of the proposed testbed for evaluating exoskeleton solutions within the stair climbing domain.

2. Staircase

2.1. Design requirements

The design of the testbed (Figure 1) had to comply with the set of technical and functional requirements listed hereafter, which were partly defined by the Eurobench project promoters and partly derived from the authors’ analysis:

  • compliant with technical and ergonomics norms of stair design;

  • a minimum of five steps to allow the acquisition of two complete stance phases for each limb;

  • variable tread and riser lengths, by changing the staircase inclination to test the exoskeletons in different climbing conditions;

  • two people, for example, the patient and the therapist, must be allowed to climb the staircase side by side, leading to a minimum width of the staircase of $ 1500 $ mm;

  • a stair landing at the top of the staircase to allow the exoskeleton users and, if necessary, their companions to turn around and face the descent;

  • motorized system to change the tread and riser lengths to facilitate and speed up the setup operations;

  • a set of sensors to allow the measurement of kinematic and dynamic data while using the exoskeleton;

  • a set of software tools, algorithms, and metrics to quantitatively analyze the user’s motor capabilities and to provide a comprehensive platform to assess the exoskeleton performances.

Figure 1. The STEPbySTEP staircase prototype.

These requirements guided the mechanical design (Section 2.2), the selection and integration of sensors (Section 2.3), the definition of metrics (Section 3), and the protocol design (Section 4).

The testbed does not target a specific user group. The proposed design, including the testing protocols and evaluation metrics, can therefore be applied to any participant.

2.2. Mechanical design and assembly

The STEPbySTEP testbed is a reconfigurable five-step staircase with appropriate adjustments to regulate the relative position among the steps and, consequently, the inclination of the staircase.

By referring to the lateral view of the adjustment mechanism depicted in Figure 2, the parallelogram four-bar linkage $ ABCD $ represents the structure of each stair stringer. Let us denote the vertical distance between $ \overline{AB} $ and $ \overline{CD} $ by h, and the total length of the stringer by d. The inclination of the staircase is $ \alpha =\arcsin \left(h/d\right) $ . Each step is constrained on the stringer by two hinges $ {A}_i $ and $ {B}_i $ . Their positions along the stringer are identical to guarantee that each step is parallel to the ground. The distance between two consecutive steps along the stringer is denoted by l. Consequently, the tread length is $ p=l\cos \alpha $ , and the riser length is $ a=l\sin \alpha $ . Therefore, the tread and the riser lengths can be adjusted by modifying h and l.
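As a minimal sketch of the relations above, the following Python function maps the adjustable quantities $ h $, $ d $, and $ l $ to the inclination $ \alpha $, the tread length $ p $, and the riser length $ a $. The function name and the numerical values in the example are hypothetical and were chosen only for illustration; they are not dimensions of the prototype.

```python
import math


def stair_geometry(h_mm: float, d_mm: float, l_mm: float):
    """Return (alpha in radians, tread p in mm, riser a in mm).

    h_mm: vertical distance between the AB and CD sides of the linkage
    d_mm: total length of the stringer
    l_mm: distance between two consecutive steps along the stringer
    """
    if not 0.0 <= h_mm < d_mm:
        raise ValueError("h must satisfy 0 <= h < d")
    alpha = math.asin(h_mm / d_mm)   # alpha = arcsin(h / d)
    p = l_mm * math.cos(alpha)       # tread length, p = l cos(alpha)
    a = l_mm * math.sin(alpha)       # riser length, a = l sin(alpha)
    return alpha, p, a


# Hypothetical values, for illustration only.
alpha, p, a = stair_geometry(h_mm=1000.0, d_mm=2000.0, l_mm=350.0)
print(f"alpha = {math.degrees(alpha):.1f} deg, p = {p:.1f} mm, a = {a:.1f} mm")
```

With these illustrative inputs, $ h/d=0.5 $ gives an inclination of 30 degrees, so the riser is half of $ l $.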

Figure 2. Kinematics of the staircase mechanism.

The realized adjusting mechanisms are depicted in Figure 3. The height of the staircase $ h $ is adjusted by a pantograph lift table (Ameise® HIW 2.0, by Jungheinrich), denoted by actuator for height regulation in Figure 1. Moreover, the distance between two consecutive steps $ l $ is adjusted by modifying the position of two dowel pins, hinged to each side of a step, along a series of holes present in each stringer.

Figure 3. Adjustments realized in the prototype.

In order to represent the adjustment capabilities of the staircase, let us refer to the international norm ISO 14122-3. It imposes the following conditions for realizing ergonomic staircases in civil works:

(2.1) $$ {b}_l\leq v\leq {b}_u, $$

with

  • $ v=p+2a $

  • $ {b}_l=600\;\mathrm{mm} $ and $ {b}_u=660\;\mathrm{mm} $ denote the boundary limits of $ v $ .

Let us define the discrepancy of a generic geometrical configuration w.r.t. the acceptable range imposed by the norm (2.1) by:

(2.2) $$ e=\begin{cases}v-{b}_l,& \mathrm{if}\;v<{b}_l\\ 0,& \mathrm{if}\;{b}_l\le v\le {b}_u\\ v-{b}_u,& \mathrm{if}\;v>{b}_u.\end{cases} $$
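The piecewise definition (2.2) translates directly into code. The sketch below is illustrative only: the function name is hypothetical, the default bounds are the values of $ {b}_l $ and $ {b}_u $ given above, and the tread and riser values in the example are made up.

```python
B_L_MM = 600.0  # lower bound b_l on v = p + 2a
B_U_MM = 660.0  # upper bound b_u on v = p + 2a


def discrepancy(p_mm: float, a_mm: float,
                b_l: float = B_L_MM, b_u: float = B_U_MM) -> float:
    """Discrepancy e of a (tread, riser) pair with respect to condition (2.1)."""
    v = p_mm + 2.0 * a_mm
    if v < b_l:
        return v - b_l   # negative: the configuration is below the range
    if v > b_u:
        return v - b_u   # positive: the configuration is above the range
    return 0.0           # compliant configuration


# Hypothetical tread/riser pairs in mm.
for p_mm, a_mm in [(300.0, 170.0), (250.0, 150.0), (350.0, 200.0)]:
    print(p_mm, a_mm, discrepancy(p_mm, a_mm))
```

The sign of $ e $ indicates on which side of the admissible range a configuration falls, and its magnitude indicates by how many millimetres.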

The colored surface in Figure 4 shows the influence of the geometrical parameters $ p $ and $ a $ on $ e $ . The values of $ p $ and $ a $ for which $ e $ is equal to 0 comply with the norm ISO 14122-3 (2.1). The gray surfaces represent the relationship between $ p $ and $ a $ , given a distance $ l $ between two consecutive steps along the stringer, as the inclination $ \alpha $ of the staircase is modified. For clarity of representation, only the surfaces with $ l=30 $ , 35, and 40 cm are represented. Intermediate configurations can be obtained by appropriately adjusting the dowel pins shown in Figure 3. Once $ l $ is set, the values of $ p $ and $ a $ that comply with the norm are given by the intersection line between the colored surface and the corresponding gray surface where $ e=0 $ .

Figure 4. Discrepancy of staircase configurations concerning the limits imposed by the ISO 14122-3 norm (2.1).
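The intersection described above can also be explored numerically. Since $ p=l\cos \alpha $ and $ a=l\sin \alpha $ , the quantity $ v=p+2a $ becomes $ l\left(\cos \alpha +2\sin \alpha \right) $ , and a sweep over $ \alpha $ returns the inclinations for which $ e=0 $ . The sketch below is a hedged illustration: the function name, the sweep step, and the 60-degree upper limit of the sweep are assumptions and do not describe the mechanical range of the prototype.

```python
import math


def compliant_inclinations(l_mm, b_l=600.0, b_u=660.0,
                           step_deg=0.1, max_deg=60.0):
    """Inclinations (degrees) for which b_l <= p + 2a <= b_u at a given l."""
    angles = []
    n_steps = int(round(max_deg / step_deg))
    for i in range(n_steps + 1):
        alpha_deg = i * step_deg
        alpha = math.radians(alpha_deg)
        # v = p + 2a with p = l cos(alpha) and a = l sin(alpha)
        v = l_mm * (math.cos(alpha) + 2.0 * math.sin(alpha))
        if b_l <= v <= b_u:
            angles.append(round(alpha_deg, 3))
    return angles


# The three step distances shown in Figure 4, expressed in mm.
for l_mm in (300.0, 350.0, 400.0):
    ok = compliant_inclinations(l_mm)
    if ok:
        print(f"l = {l_mm:.0f} mm: compliant from {ok[0]} to {ok[-1]} deg")
    else:
        print(f"l = {l_mm:.0f} mm: no compliant inclination in the sweep")
```

Under these assumptions, a larger $ l $ shifts the compliant band toward lower inclinations, because the same value of $ v $ is reached with a smaller riser.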

2.3. Sensors selection

The testbed is provided with integrated sensors and acquisition systems to measure kinematic and dynamic data. The handrails of the testbed are instrumented with four load cells (Forsentek, F3G-500 N), two for each handrail, connected through PhidgetBridge 4-Input bridge interface modules (Phidgets Inc., Canada), to measure the forces applied by the user wearing the exoskeleton (e.g., a patient using one crutch and the support of a handrail) during the ascent or descent of the stairs. Two force plates (P-6000, BTS Bioengineering, Italy) can be inserted into any of the steps to measure the ground reaction forces, thanks to movable wooden tiles that allow for flexible positioning. This setup also permits the insertion of two force plates in the same step, which is useful, for example, when studying both the foot placement and the use of a crutch simultaneously. Load cells and force plates are installed as depicted in Figure 1 and Figure 5.

Figure 5. Load cells and force plates installed on the staircase.

A motion capture system (Perception Neuron Pro, Noitom Ltd., US), available at the installation premises, is part of the testbed and measures the user kinematics.

3. Metrics

3.1. Physical interaction metrics

One way to assess the performance of an exoskeleton is by investigating the user’s motor capability in performing a task while wearing the exoskeleton. This can be done by analyzing physical measurements, such as the muscular activation pattern and interaction forces. This section discusses the developed metrics calculated from the Surface ElectroMyoGraphic (SEMG) signals and interaction forces measured with two force plates installed in the staircase steps and four load cells integrated into the handrails. Using the IMUs to calculate the duration of the gait phases, that is, stance (ST), swing (SW), double support (DS), and gait cycle (GC), the metrics are calculated for each phase.

SEMG is a method to record the electrical muscle activity during contraction; the stronger the muscle contraction, the higher the electrical activity. Two metrics based on SEMG were used: an index to quantify the level of activity of a muscle, the amount of EMG of muscle $ i $ ( $ {AEMG}^i $ ), and an index quantifying the level of co-contractions between two muscles $ i $ and $ j $ ( $ {ccEMG}^{ij} $ ). These metrics have been previously used in studies related to exoskeletons and gait analysis (Roveda et al., 2019). The novelty in our approach lies in applying these metrics to stair climbing, specifically to assess the effect of exoskeleton assistance on the neuromuscular activation pattern across different gait phases.

The significance of $ {AEMG}^i $ and $ {ccEMG}^{ij} $ on gait functionality is supported by previous research, which has demonstrated that increased co-contraction may be associated with joint stiffness and altered gait patterns, particularly in populations with gait impairments (Damiano et al., 2000; Latash, 2018). Additionally, reduced or optimized muscle activation patterns have been shown to correlate with more efficient gait mechanics, potentially reducing the metabolic cost of walking (Collins et al., 2015; Galle et al., 2017). Thus, these metrics indicate the exoskeleton’s ability to assist in achieving a more efficient and stable gait pattern during stair climbing.

Regarding the ground and handrail interaction forces, several metrics were defined for both feet: i) the maximum vertical and sagittal ground reaction force at the beginning stance ( $ {GFz}_1 $ and $ {GFy}_1 $ ) and ii) at push-off ( $ {GFz}_2 $ and $ {GFy}_2 $ ), iii) the maximum vertical and sagittal handrail reaction forces during double support ( $ HFz $ and $ HFy $ ), and iv) the maximum lateral handrail reaction force during double support ( $ HFx $ ). These metrics are used to assess the subject’s capability to: i) accept the load, ii) push off before the swing phase, iii) maintain equilibrium without exerting excessive forces on the handrails in the vertical and sagittal directions, and iv) maintain equilibrium without exerting excessive lateral forces on the handrails.
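To illustrate how such peak-force metrics could be extracted from sampled data, the sketch below takes the largest force magnitude inside a time window. It is not the software provided to Eurobench: the function name, the synthetic samples, the use of the force magnitude, and the split of the stance phase at its midpoint (to separate load acceptance from push-off) are all assumptions made for illustration.

```python
def peak_in_window(times_s, force_n, t_start, t_end):
    """Largest force magnitude (N) among samples with t_start <= t <= t_end."""
    samples = [abs(f) for t, f in zip(times_s, force_n) if t_start <= t <= t_end]
    if not samples:
        raise ValueError("no samples inside the requested window")
    return max(samples)


# Synthetic vertical ground reaction force over one stance phase (hypothetical).
times_s = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
fz_n = [0.0, 300.0, 720.0, 650.0, 560.0, 600.0, 700.0, 740.0, 400.0, 0.0]

stance_start, stance_end = 0.0, 0.9
midpoint = 0.5 * (stance_start + stance_end)  # assumed boundary between sub-windows

gfz_1 = peak_in_window(times_s, fz_n, stance_start, midpoint)  # beginning of stance
gfz_2 = peak_in_window(times_s, fz_n, midpoint, stance_end)    # push-off
print(gfz_1, gfz_2)
```

The same helper could be applied to the handrail load-cell channels over a double-support window to obtain values analogous to $ HFz $ , $ HFy $ , and $ HFx $ .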

The correlation between these ground reaction force (GRF) metrics and specific gait functionalities has been well-documented in the literature. For example, maximum vertical GRF during the stance phase is indicative of the body’s ability to support itself and absorb impact, which is often compromised in individuals with gait deficits (Winter, 1991; Vaughan et al., 1999). Similarly, the push-off phase is crucial for forward propulsion, and deficiencies in this phase are linked to reduced gait speed and efficiency, often observed in elderly or neurologically impaired populations (Schenck and Kesar, 2017; Awad et al., 2020; Alingh et al., 2021). The handrail forces, on the other hand, can indicate a reliance on external support, which may be necessary for maintaining balance and stability during stair ascent and descent, particularly in individuals with compromised motor control (Novak and Brouwer, 2014; Stessman et al., 2017).

The combined analysis of these metrics provides a comprehensive understanding of the physical interactions between the user and the exoskeleton, offering insights into the device’s ability to support or enhance the user’s natural movement patterns during stair navigation. This multifaceted approach ensures that the evaluation captures both the mechanical and neuromuscular aspects of gait, allowing for a more nuanced assessment of exoskeleton performance.

3.2. Temporal phases

In stair negotiation, defining sub-phases beyond the basic stance, swing, and double-support phases is essential for a deeper understanding of biomechanics. These sub-phases help capture the finer details of movement dynamics during each part of stair ascent and descent. By segmenting the movement into more specific intervals, such as weight acceptance, push-off, or controlled lowering, it becomes possible to analyze the biomechanical contributions during each part of the stair negotiation process. This level of detail is particularly important when studying complex systems like exoskeletons, which aim to replicate or assist human movement. A more detailed sub-phase framework allows for a nuanced evaluation of performance and movement assistance provided by these systems.

In this paper, we adopt the phase and sub-phase definitions proposed by Harper et al. (2018), as illustrated in Figure 6. The detailed breakdown of phases provides a structured foundation for analyzing exoskeleton performance in stair negotiation tasks.

Figure 6. Figure adapted from Harper et al. (2018).

The six phases (and their associated events) during ascent are as follows:

  • Stance phase:

    • Weight acceptance (WA). From ipsilateral foot strike to contralateral toe-off.

    • Pull-up (PU). From the beginning of single-leg support to the mid-swing of the contralateral leg.

    • Forward continuance (FCN). From mid-swing of the contralateral leg to the contralateral foot strike.

  • Swing phase:

    • Push-up (PU). From contralateral foot strike to ipsilateral toe-off.

    • Swing foot clearance (SFC). From ipsilateral toe-off to mid-swing of the ipsilateral leg.

    • Swing foot placement (SFP). From mid-swing of the ipsilateral leg to ipsilateral foot strike.

The five phases during the descent are as follows:

  • Stance phase:

    • Weight acceptance (WA). From right foot contact to left toe-off.

    • Forward continuance (FCN). From left toe-off (single right-leg support) to mid-swing of the left leg.

    • Controlled lowering (CL). From mid-swing of the left leg until right toe-off.

  • Swing phase:

    • Leg pull-through (LP). From right toe-off to mid-swing of the right leg.

    • Foot placement (FP). From mid-swing of the right leg to right foot contact.

Based on this phase definition, 11 temporal metrics can be established (see Table 1). This protocol aims to achieve two primary objectives: first, to compute the temporal metrics (refer to Section 4.2); and second, to provide the raw kinematic data to the end user, enabling its utilization as deemed appropriate. For each sub-phase, for example, the average angles for the different joints can be calculated, as well as the angle at the end of the sub-phase.
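As a hedged illustration of how temporal metrics can follow from the phase definition, the sketch below computes the duration of each ascent sub-phase from the timestamps of the gait events that delimit it. The event names, the dictionary layout, and the timestamps are hypothetical, and the metrics actually listed in Table 1 may be defined differently.

```python
# Ascent events in chronological order; consecutive events delimit one sub-phase.
ASCENT_EVENTS = [
    "ipsi_foot_strike", "contra_toe_off", "contra_mid_swing",
    "contra_foot_strike", "ipsi_toe_off", "ipsi_mid_swing",
    "next_ipsi_foot_strike",
]
ASCENT_SUBPHASES = [
    "weight_acceptance", "pull_up", "forward_continuance",
    "push_up", "swing_foot_clearance", "swing_foot_placement",
]


def subphase_durations(event_times_s):
    """Map each ascent sub-phase to its duration in seconds."""
    times = [event_times_s[name] for name in ASCENT_EVENTS]
    if any(t1 <= t0 for t0, t1 in zip(times, times[1:])):
        raise ValueError("event times must be strictly increasing")
    return {phase: t1 - t0
            for phase, t0, t1 in zip(ASCENT_SUBPHASES, times, times[1:])}


# Hypothetical event timestamps (seconds) for one ascent gait cycle.
events = {
    "ipsi_foot_strike": 0.00, "contra_toe_off": 0.20, "contra_mid_swing": 0.45,
    "contra_foot_strike": 0.70, "ipsi_toe_off": 0.90, "ipsi_mid_swing": 1.10,
    "next_ipsi_foot_strike": 1.40,
}
durations = subphase_durations(events)
cycle_s = sum(durations.values())
for phase, dur in durations.items():
    print(f"{phase}: {dur:.2f} s ({100.0 * dur / cycle_s:.0f}% of the cycle)")
```

The same pattern applies to the descent phases by swapping in their five sub-phases and the corresponding events.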

Table 1. Temporal metrics

3.3. Human factors metrics

Several literature contributions highlight the importance of adopting a user-centered perspective on exoskeleton design (Bush and ten Hompel, 2017). For instance, among the most critical factors for successful gerontechnology acceptance and uptake are usability (Shore et al., 2019), the possibility of performing motor and cognitive tasks simultaneously (O’Sullivan et al., 2015), and comfort (Wolff et al., 2014). Despite this, human factors and human–technology interaction aspects, such as usability and acceptance, are often overlooked when exoskeletons are designed. Moreover, there is currently no framework in the literature for evaluating the usability of exoskeletons and their impact on users’ perceptions and behaviors in the task of climbing and descending stairs.

The cognitive workload is an important factor to consider when evaluating the performance of exoskeletons, as it can affect the user’s ability to use the device effectively and safely. The dual-task paradigm is commonly used for assessing cognitive workload, as it involves the simultaneous performance of two tasks that differ in their cognitive demands (Shaw et al., 2018). Usability is another key factor. The System Usability Scale (SUS) is a widely used tool for measuring usability. It assesses the user’s overall satisfaction, and it is widely adopted for exoskeleton usability assessment as well (Orekhov et al., 2021). Perceived musculoskeletal pressure and discomfort are, as well, crucial factors to consider when evaluating exoskeletons, as they can affect the user’s ability to use the device comfortably and effectively over an extended period. The local perceived pressure method is a tool that can be used to assess the level of pressure or discomfort experienced by the user when using the exoskeleton (Kermavnar et al., 2021). Technology acceptance is another crucial factor to consider when evaluating exoskeletons. The van der Laan Acceptance Scale is a tool that can be used to assess the user’s acceptance of the technology. The scale measures the user’s willingness to use the technology and the level of confidence they have in its ability to perform the intended tasks (Van Der Laan et al., 1997).

The following cognitive metrics are thus selected, encompassing in particular: cognitive workload as measured by an adapted version of the dual-task paradigm (Plummer-D’Amato et al., 2008); usability as measured by the System Usability Scale (Lewis and Sauro, 2017); perceived musculoskeletal pressure/discomfort as measured by the local perceived pressure method (Hamberg-van Reenen et al., 2008); and technology acceptance as assessed by the van der Laan Acceptance Scale (Van Der Laan et al., 1997).
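For readers unfamiliar with the SUS, the sketch below applies its standard scoring rule: ten items rated from 1 to 5, odd-numbered items contributing the rating minus 1, even-numbered items contributing 5 minus the rating, and the sum multiplied by 2.5 to give a 0–100 score. The function name and the example ratings are hypothetical, and any adaptation of the questionnaire used in the project is not reflected here.

```python
def sus_score(ratings):
    """Standard SUS score (0-100) from ten item ratings on a 1-5 scale."""
    if len(ratings) != 10:
        raise ValueError("the SUS has exactly ten items")
    if any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("each rating must be an integer from 1 to 5")
    total = 0
    for item_number, rating in enumerate(ratings, start=1):
        if item_number % 2 == 1:
            total += rating - 1   # odd items are positively worded
        else:
            total += 5 - rating   # even items are negatively worded
    return total * 2.5


# Hypothetical ratings from one participant.
print(sus_score([4, 2, 4, 2, 5, 1, 4, 2, 4, 3]))
```

Because odd and even items are scored in opposite directions, a participant who answers 3 on every item obtains a mid-scale score of 50.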

4. Protocol design

Protocols exploiting the metrics, proposed in this section, are intended to form the basis for future ad hoc protocols to investigate specific features of the exoskeletons.

4.1. Physical interaction protocol

This protocol is based on the physical interaction metrics. The rectus femoris (RF) and the biceps femoris (BF) muscles were selected as a showcase because they are highly involved in the task. More muscles should be studied to obtain a fuller picture of the neuromuscular activation pattern underlying the movement.

The setup is as follows:

  • four-EMG channels (four probes of the Delsys system) to assess bilaterally the activity of the selected muscles;

  • two IMUs (two probes of the Delsys system) placed at the shanks to identify the gait phases (Salarian et al., 2004);

  • two force plates inserted in the second and third steps to evaluate a complete gait cycle; and

  • two load cells integrated into one of the handrails to measure the interaction forces.

Data acquisition protocol:

  1. User preparation: donning of the exoskeleton and EMG electrode placement.

  2. Resting and standing acquisition: resting acquisition with the subject sitting on a chair.

  3. Place the subject at a one-step distance from the first step of the stairway.

  4. Start data acquisition: launch the data collection software.

  5. Ascending stairs: at the go command, the subject makes a left footstep and climbs the stairs, beginning with the right foot, till reaching the upper platform (usually, subjects use the support of a crutch and the handrail).

  6. Stop the acquisition when the user stands with two feet on the upper platform.

  7. Turning: the subject turns and prepares to descend the stairs.

  8. Descending stairs.

  9. Trials repetitions: steps 3–7 are repeated (four other trials).

  10. Change the staircase inclination: the operator changes the staircase inclination using the platform elevator.

  11. Second set of trials: steps 3–7 are repeated (five trials).

The execution of the protocol, donning the exoskeleton excluded, lasts 30–40 min. Metrics are calculated offline with the software provided to Eurobench.

4.2. Temporal phases protocol

The protocol aims to define the phases of stair negotiation during both ascent and descent, facilitating a detailed understanding of human movements in these contexts. To achieve this, a subset of sensors from a motion capture (mocap) system is utilized, specifically tailored for the knee joint angle calculation and phase definition.

The setup consists of the Perception Neuron Pro system (Noitom Ltd., Miami, US), which uses inertial measurement units (IMUs) for capturing kinematic data.

The protocol is defined as follows:

  1. Place the mocap system onto the subject’s body (focusing on the lower limbs). The placement is optimized based on the geometric constraints of each exoskeleton and the recommended locations provided by the capture system manufacturer, allowing for flexibility in sensor positioning.

  2. Perform mocap calibration to ensure accurate data capture.

  3. Position the subject at the starting point, which is one step away from the stairway.

  4. Launch the data collection software to begin recording.

  5. Conduct the experiment, where the subject ascends and descends the stairs using a step-over-step approach.

  6. Stop the data collection once the subject returns to the initial position.

  7. Repeat the procedure five times to ensure reliability and consistency in the data.

The total of five runs takes approximately 22 min.

For the gait events and phase identification, only knee joint angle data have been used (Figure 7). Looking at the figure, the differences between exo and no-exo are particularly noticeable in the three peaks during the ascending phases. These peaks indicate a slightly more rigid and less fluid movement in the exo case, suggesting that the knee does not rotate as much as it does without the exoskeleton.

Figure 7. Knee joint angle of a healthy subject during stair ascent and descent, with (left) and without (right) a lower limb exoskeleton (LLE).

The algorithm used to identify gait events and phases from knee angle data is based on the K-nearest-neighbors (KNN) machine learning (ML) algorithm (results are shown in Figure 10). KNN is a supervised classification method that assigns a category to a data point based on its proximity to other, already categorized points. In the context of gait phase identification, the KNN algorithm classifies IMU data into different gait phases by comparing the current data to a previously labeled training dataset.
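The KNN principle described above can be illustrated with a minimal sketch. The feature choice (knee angle plus angular velocity), the value of k, the phase labels, and all sample values below are assumptions made for illustration; the source only states that KNN is applied to knee joint angle data.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest labeled samples."""
    # Euclidean distance from the query to every labeled training sample.
    dists = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    # Majority vote over the k closest samples.
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Hypothetical features: (knee angle in degrees, angular velocity in deg/s).
train_X = [(5, 0), (8, 2), (10, -1), (60, 40), (70, 10), (65, -30)]
train_y = ["stance", "stance", "stance", "swing", "swing", "swing"]

print(knn_predict(train_X, train_y, (7, 1)))    # small flexion angle
print(knn_predict(train_X, train_y, (62, 20)))  # large flexion angle
```

In practice the labeled set would come from manually annotated trials, and separate labeled sets would be kept for ascent and descent.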

Two separate ML models were trained: one for stair ascent and another for descent. The training dataset consists of 34 trials from five participants (three females and two males) without exoskeletons. The participants’ mean age was 31.4 years (SD: 7.27), their mean weight was 75.6 kg (SD: 4.39), and their mean height was 1.71 m (SD: .09). The first three participants completed six trials each, while the last two completed eight trials. The trained algorithms were subsequently tested using an exoskeleton, and the results are presented and discussed in Section 5.

This protocol not only enables the calculation of phases and sub-phases but also allows for the assessment of joint angles using IMUs (Recinos et al., 2020). Additionally, it is compatible with other signal acquisition systems, such as EMG and GRF. This combination offers a more comprehensive analysis of biomechanical performance, extending beyond traditional phases like stance, swing, and double support. In this paper, we focus specifically on the definition of temporal phases.

4.3. Human factors protocol

Human factors metrics are assessed through an adapted version of the dual-task paradigm – applied to measure cognitive workload – and through various questionnaires and scales measuring usability, acceptance, and discomfort – administered at the end of the task. Regarding the cognitive workload, this protocol emulates cognitively demanding situations, as participants are asked to simultaneously perform the STEPbySTEP testbed task (going up and down the stairs continuously) and a sound recognition task. The latter consists of presenting a sequence of sound stimuli (phonemes) to the participants, who have to say either “no” (each time the phoneme differs from the previous one) or “yes” (each time the phoneme is the same as the previous one). To discern differences in cognitive performance due to the exoskeleton design characteristics, every participant must perform the task under two conditions: the control condition (without wearing the exoskeleton) and the experimental condition (while wearing the exoskeleton). The control condition is necessary since the dual-task paradigm allows the cognitive workload to be assessed by calculating the difference in the number of errors and latency of responses between the single-task baseline (only the sound recognition task) and the dual-task condition (when adding exoskeleton-assisted movement). Therefore, analyzing differences in the cognitive workload levels between the two conditions allows us to discern the cognitive load generated by wearing the exoskeleton alone. Higher latency and a higher number of errors correspond to higher levels of cognitive workload (Koch et al., 2018).

In other words, our protocol envisages conducting a cognitively demanding task (the sound recognition task) in two different conditions: without wearing the exoskeleton and with the exoskeleton on. Since the sound recognition task does not change between the two conditions, and the only feature that changes between the two conditions is the presence of the exoskeleton, we can infer that changes in cognitive load are caused by the exoskeleton design. To minimize bias, participants were trained to familiarize themselves with the use of the exoskeleton. The STEPbySTEP testbed task (going up and down the stairs continuously) is performed in both conditions to ensure that participants are actively interacting with the exoskeleton.
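As a rough illustration of how the sound recognition task could be scored and compared between conditions, the sketch below derives the expected “yes”/“no” answers from a phoneme sequence, counts errors, and computes the between-condition differences. The function names, data layout, and all values are hypothetical; they do not describe the OpenSesame implementation used by the authors.

```python
from statistics import mean

def expected_answers(phonemes):
    """'yes' when a phoneme repeats the previous one, 'no' otherwise."""
    return ["yes" if cur == prev else "no"
            for prev, cur in zip(phonemes, phonemes[1:])]

def count_errors(phonemes, answers):
    """Number of spoken answers that differ from the expected ones."""
    return sum(a != e for a, e in zip(answers, expected_answers(phonemes)))

def workload_delta(no_exo, exo):
    """Exoskeleton-minus-control differences in mean latency and error count."""
    return {
        "latency_ms": mean(exo["rt_ms"]) - mean(no_exo["rt_ms"]),
        "errors": exo["errors"] - no_exo["errors"],
    }

# Hypothetical stimuli and responses, for illustration only.
phonemes = ["ba", "ba", "da", "da", "ka"]
answers = ["yes", "no", "no", "no"]          # third answer is wrong
errors_exo = count_errors(phonemes, answers)

no_exo = {"rt_ms": [900, 1000, 1100], "errors": 0}
exo = {"rt_ms": [1000, 1100, 1200], "errors": errors_exo}
delta = workload_delta(no_exo, exo)
print(delta)
```

Positive differences in either field would point toward a higher workload with the exoskeleton, following the interpretation given in the text.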

Regarding the human factors questionnaires, usability is measured through the 10-item System Usability Scale (Lewis and Sauro, 2017), a 5-point Likert scale going from 1 = strongly disagree to 5 = strongly agree. Example item: “I think that I would like to use this exoskeleton frequently.”
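For reference, the sketch below shows the conventional scoring of the System Usability Scale, in which odd-numbered items are positively worded and even-numbered items are negatively worded, and the total is rescaled to 0–100. The source does not state how item responses were aggregated in this study, so this is only the standard scheme, shown with hypothetical responses.

```python
def sus_score(responses):
    """Standard SUS score (0-100) from ten responses on a 1-5 scale."""
    if len(responses) != 10:
        raise ValueError("SUS expects exactly 10 item responses")
    total = 0
    for item, score in enumerate(responses, start=1):
        # Odd items: higher agreement is better; even items are reverse-scored.
        total += (score - 1) if item % 2 == 1 else (5 - score)
    return total * 2.5

# Hypothetical responses from one participant.
print(sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2]))
```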

Acceptance is measured using the 9-item van der Laan Acceptance Scale (Van Der Laan et al., 1997). Participants are asked to rate the exoskeleton on a 5-point semantic differential scale with two opposite adjectives (e.g., useless–useful). The score ranges from −2 (negative opposite) to +2 (positive opposite). Four items relate to the usefulness sub-dimension (e.g., “I found the XSPINE prototype to be … Effective – Superfluous”), while five items relate to the satisfaction sub-dimension (e.g., “I found the XSPINE prototype to be … Pleasant – Unpleasant”).

Discomfort is measured through a modified version of the 12-item Local Perceived Discomfort Scale (Hamberg-van Reenen et al., 2008) that includes lower limbs. Participants are asked to rate the extent to which the exoskeleton exerted pressure on different body parts on an 11-point scale (0 = no pressure at all to 10 = maximal pressure).

In both conditions, an experimenter registers the participant’s responses manually. Data are collected through the “OpenSesame” software, which records the audio and the participants’ number of errors and response times. Moreover, participants are asked to fill out usability, acceptance, and discomfort scales through an online questionnaire on “Qualtrics” at the end of the task.

The summary of the protocol is detailed below:

  1. A detailed explanation of the experiment and the two phases of the task;

  2. Earphones wearing and volume calibration;

  3. Running “OpenSesame” software;

  4. Place the subject at the starting point;

  5. Trial run: perform the dual task;

  6. Addressing any doubts and questions from the participant, re-calibrating the volume if needed, and re-running “OpenSesame”;

  7. Experimental run: perform the dual task;

  8. Donning the exoskeleton;

  9. Repeat the procedure described from step 3 to step 8 (with the participant wearing the exoskeleton);

  10. Doffing the exoskeleton;

  11. Saving audio recordings and data collected through “OpenSesame” (i.e., response time in milliseconds and error number);

  12. Ask participants to fill out the Local Perceived Discomfort Scale (Hamberg-van Reenen et al., 2008) to assess discomfort;

  13. Administration of the System Usability Scale (Lewis and Sauro, 2017) and van der Laan Acceptance Scale (Van Der Laan et al., 1997) to assess usability and acceptance.

The total time to perform the human factors protocol (for one subject) is around 20 min.

5. Results

5.1. Methods

All experiments in the EUROBENCH project, including those conducted in the STEPbySTEP project and reported in this paper, were approved by the Ethics Committee of the Consejo Superior de Investigaciones Científicas (CSIC).

All methods were carried out in accordance with relevant guidelines and regulations, with informed consent obtained from all subjects.

5.2. Experimental tests

To evaluate the effectiveness of the proposed protocols, three exoskeletons from different manufacturers were tested: two lower-limb exoskeletons (LLEs) and one back-support exoskeleton (Figure 8). Each exoskeleton is described in the relevant subsection, along with a summary of protocol testing and example results, followed by a brief discussion. While these results are not intended to provide a comprehensive assessment of exoskeleton performance, they serve as a valuable demonstration of the versatility of the testbed and the potential of the proposed metrics. These initial findings illustrate the broad applicability of the approach, highlighting its capacity to effectively evaluate a variety of devices.

Figure 8. (a) A user, wearing TWIN, ascending the STEPbySTEP staircase in the 11 cm step-height configuration; a frame of the right double-support phase. (b) BELK exoskeleton while descending the STEPbySTEP staircase. (c) XSPINE back-support exoskeleton.

5.2.1. The TWIN: physical interaction protocol

The physical interaction protocol was tested using the TWIN, an LLE developed by the Italian Institute of Technology (IIT) with support from the Istituto Nazionale per la Assicurazione contro gli Infortuni sul Lavoro (INAIL). TWIN is a novel, modular LLE designed for personal use by individuals with spinal cord injuries. A detailed description of the system and its functionalities is available in Laffranchi et al. (2021). Testing was conducted on a healthy subject under two conditions: stair pitches with step heights of 11 and 17 cm.

Example results from two trials in these conditions are presented in both numerical and graphical formats, followed by a brief discussion. Table 2 shows the gait phase durations for each condition. Interestingly, the relative and absolute temporal parameters remain stable despite the increased step height, indicating consistent support from the exoskeleton. Compared to regular walking, the stance phase is slightly prolonged (73% vs. 60% in walking; Alamdari and Krovi, 2017), while the swing phase is reduced. The gait cycle (GC) duration is approximately doubled relative to normal walking (around 7 vs. 3 s; Alamdari and Krovi, 2017), although left and right gait remain symmetrical.

Table 2. Gait phases in seconds and gait cycle percentage, relating to a step height of 11 (Cond 1) and 17 cm (Cond 2). St = Stance; SW = Swing; DS = Double Support; GC = Gait Cycle

Figure 9 displays the ground and handrail reaction forces and the EMG activity of the left and right rectus femoris (RF) and biceps femoris (BF) for condition 1 (left image) and condition 2 (central image). The forces are expressed as percentages of the combined weight of the subject and exoskeleton.
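The normalization mentioned above (forces as a percentage of the combined subject and exoskeleton weight) amounts to a one-line computation. The sketch below assumes forces in newtons and masses in kilograms; the gravity constant, function name, and sample values are illustrative assumptions, not values from the study.

```python
G = 9.81  # standard gravity in m/s^2 (assumed value)

def force_percent_weight(force_n, subject_kg, exo_kg):
    """Express a force (N) as a percentage of subject-plus-exoskeleton weight."""
    total_weight_n = (subject_kg + exo_kg) * G
    return 100.0 * force_n / total_weight_n

# Hypothetical masses and force sample, for illustration only.
pct = force_percent_weight(force_n=465.975, subject_kg=70.0, exo_kg=25.0)
print(round(pct, 1))
```

Normalizing this way makes ground and handrail forces comparable across users and devices of different mass.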

Figure 9. Ground and handrail reaction forces and the EMG activity of the rectus and biceps femoris (left panels). All signals are synchronized, and the gait phases are shown for clarity. In the right panel: an example of comparison between data of condition 2 (step height = 17 cm) and condition 1 (step height = 11 cm), which is taken equal to 1 as reference.

In both conditions 1 and 2, the left and right muscle activation patterns appear physiological (lower graphs), indicating that the subject is comfortable using the exoskeleton without requiring compensatory movements. All reaction forces (Figure 9, left panels, upper figures) exhibit repeatable patterns across all gait phases. Differences in the handrail force component are observed in the double-support (DS) phases. Specifically, HFz is 10 times larger during the left DS phase (i.e., when the left foot precedes the right) than during the right DS phase. Additionally, the maximum vertical component of the ground reaction force (GRF) does not reach 100% of the total weight, as the subject leans on the handrail during the left DS phase but leans on a cane during the right DS phase (note that cane forces were not measured in this experiment). All the above considerations apply to both conditions.

The physical interaction metrics (Section 3.1) for condition 2 are plotted in Figure 9 (right panel) against those for condition 1, used here as reference values. This data presentation format emphasizes differences between the two conditions, with the following observations. The vertical and lateral components of the handrail forces are similar across the two conditions (see Figure 9, right panel, upper image). In contrast, the sagittal component HFy (green line in Figure 9, left panels) in condition 2 is larger during the right DS phase and smaller during the left DS phase compared to condition 1 (see Figure 9, right panel, upper image). This suggests that with higher steps, the user exerts more pull on the handrail when the right foot precedes the left, while less pull is needed when the left foot precedes the right, as seen with lower steps. Correspondingly, left and right GFy1 and GFy2 (Figure 9, right panel, central image) have higher values in condition 2 than in condition 1, indicating that braking and pushing forces increase with step height. Finally, RF EMG activity also rises with step height (Figure 9, right panel, lower image), which aligns with the increased joint torques and power requirements associated with ascending higher steps (Dollar and Herr, 2008).
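The reference-based presentation described here, with condition 1 taken as 1, can be sketched as a simple ratio per metric. The metric keys and numbers below are hypothetical placeholders and are not taken from Figure 9.

```python
def relative_to_reference(reference, condition):
    """Divide each metric by its reference value, so the reference maps to 1."""
    return {name: condition[name] / reference[name] for name in reference}

# Hypothetical metric values for the two step heights.
cond1 = {"HFy_right_DS": 4.0, "GFy1_left": 10.0, "RF_EMG_left": 0.20}
cond2 = {"HFy_right_DS": 6.0, "GFy1_left": 12.0, "RF_EMG_left": 0.25}

ratios = relative_to_reference(cond1, cond2)
for name, value in ratios.items():
    # Values above 1 mean the metric grew with step height.
    print(f"{name}: {value:.2f}")
```

A ratio only makes sense for metrics whose reference value is non-zero, so metrics that can vanish in the reference condition would need a different comparison.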

5.2.2. BELK – temporal phases protocol

The temporal phases protocol was tested using the BELK, an LLE developed by Gogoa (https://www.gogoa.eu/belk). BELK is an active knee exoskeleton (Figure 8b) designed to support the rehabilitation of patients recovering from surgical procedures or managing neurological conditions. The system allows customization of the angular movement range and walking speed, and it can adjust the level of assistance by generating a supportive force field tailored to the patient’s individual needs and recovery progress.

The objective of this test was to evaluate the effectiveness of the ML algorithm in identifying temporal phases. The models were trained on a dataset of healthy subjects ascending and descending stairs without exoskeletons. The goal was to assess the performance of the model when applied to subjects using an exoskeleton, even though it was initially trained on data from individuals without one. To demonstrate the validity of the algorithm, the identified sub-phases are highlighted in the knee joint angle graph during the ascending phase (Figure 10). Notably, the sub-phases identified by KNN in trials with the exoskeleton (Figure 10, right) are clearly defined when compared to the manually labeled ground truth of a subject ascending stairs without an exoskeleton (Figure 10, left).

Figure 10. Ascending mocap data example. Ground truth sub-phases manually labeled for a healthy user without an exoskeleton (left) compared to ML algorithm predictions with an exoskeleton (right).

Table 3 shows the temporal metrics computed for the ascending data in Figure 10. The figure does not present descending data, but the ML algorithm also works well for descent.

Table 3. Temporal ascending metrics for Figure 10. Gray cells correspond to the % with respect to the gait cycle. Stance is the sum of the stance sub-phases, and swing is the sum of the swing sub-phases
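Temporal metrics of this kind can be derived from a per-sample sequence of predicted sub-phase labels. The sketch below assumes a fixed sampling rate and label names prefixed with "stance" or "swing"; both the label names and the 100 Hz rate are hypothetical choices made for illustration.

```python
from collections import Counter

def temporal_metrics(labels, fs_hz):
    """Duration (s) and share of the gait cycle (%) per sub-phase and phase."""
    counts = Counter(labels)
    total = len(labels)

    def entry(n):
        return {"seconds": n / fs_hz, "percent_gc": 100 * n / total}

    metrics = {name: entry(n) for name, n in counts.items()}
    # Stance and swing are the sums of their respective sub-phases.
    for phase in ("stance", "swing"):
        n = sum(c for name, c in counts.items() if name.startswith(phase))
        metrics[phase] = entry(n)
    return metrics

# One hypothetical gait cycle sampled at 100 Hz.
labels = (["stance_a"] * 40 + ["stance_b"] * 30
          + ["swing_a"] * 20 + ["swing_b"] * 10)
m = temporal_metrics(labels, fs_hz=100)
print(m["stance"], m["swing"])
```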

As seen in Figure 10, the ML algorithm correctly identifies the gait phases. Failures in identification may occur if the exoskeleton cannot adequately replicate typical knee motion. When analyzing the walking pattern of a healthy individual without an exoskeleton ascending stairs, certain patterns emerge in the peaks of the knee rotation signal, which correlate with knee–ankle movements essential for stair ascent and descent. If a user with an exoskeleton that has incorrect ascending or descending movement patterns performs the test, these patterns may change, as either the knee or the ankle may not rotate properly. This can be observed visually as a “rigid” movement of the knee and ankle, indicating limited rotation. These results could help exoskeleton designers consider necessary mechanical and software modifications to better mimic natural joint motions during stair ascent and descent.

Using this method, particularly alongside other biomechanical measures such as kinematics and dynamics, could lead to more detailed analysis, potentially supporting more precise benchmarking of exoskeleton effectiveness across various movement phases. The algorithm shows effectiveness in detecting temporal phases, especially when knee motion during exoskeleton use closely mirrors physiological patterns. However, some limitations may emerge if movement patterns deviate significantly from the expected norm. For example, variations could arise if the ankle movement is restricted or actively controlled, potentially impacting algorithm performance. In such cases, retraining the model for specific exoskeleton configurations could be beneficial. Alternatively, if the system allows adjustments to assistance settings, these results could guide the optimization of those settings.

This approach may also find relevance beyond rehabilitation. Accurately identifying sub-phases could contribute to advancements in assistive technologies for individuals with chronic conditions. Additionally, applying machine learning algorithms trained on healthy individuals to those with varying movement patterns suggests a degree of adaptability, which could improve exoskeleton effectiveness across diverse user populations.

5.2.3. XSPINE – the human factors protocol

A back-support exoskeleton, XSPINE, was tested, with a complete description available in Roveda et al. (2020, 2022).

The human factors metrics and protocol were applied to evaluate the exoskeleton. The dataset included six participants (three men and three women, aged between 25 and 35) who voluntarily participated in the study and completed the previously described human factors protocol. Table 4 provides descriptive statistics for the scales used with participants. Results indicate that participants rated the XSPINE exoskeleton positively in terms of usability (4.0 ± .7), with favorable assessments of its usefulness (.4 ± .4) and overall satisfaction (.9 ± .8).

Table 4. Descriptive statistics for usability, acceptance, and local perceived discomfort scales

In addition, the exoskeleton was reported to exert very low to low pressure across various body areas. The shoulders (3.4 ± 1.8), the right lower back and hip (2.8 ± 2.0), and the left lower back and hip (2.7 ± 2.0) were identified as the most uncomfortable areas.

Regarding cognitive workload, the variables used to operationalize cognitive performance are the average response time and the number of errors in both conditions (without and with the exoskeleton). While we acknowledge that our statistical power may be limited, we ran a paired-sample t-test in IBM SPSS 22 to analyze each participant’s average score difference between the two conditions, since the dual-task paradigm is a within-subject experimental design. Table 5 displays the descriptives and results of the paired-sample t-test. The results show no significant performance difference between the two conditions for either response time or the number of errors. In particular, the response time difference between the no-exoskeleton (1128 ± 203 ms) and exoskeleton (1072 ± 161 ms) conditions is not statistically significant (t = 2.3; p > .05). The same applies to the difference in the number of errors between the no-exoskeleton (.3 ± .5) and exoskeleton (1.7 ± 1.6) conditions, which is also not statistically significant (t = −2.0; p > .05). Therefore, these preliminary findings suggest that wearing the exoskeleton does not seem to influence users’ cognitive workload and performance, a trend that future research should verify. This is particularly relevant for the tasks envisaged by the STEPbySTEP testbed described previously (e.g., climbing and descending stairs).
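The paired-sample t statistic can be reproduced with a few lines of standard-library Python. The sketch below is not the SPSS procedure used by the authors, and the response times are invented for illustration. With six participants there are 5 degrees of freedom, for which the two-tailed critical value at the .05 level is about 2.571, so |t| values of 2.3 and 2.0 fall short of significance.

```python
import math
from statistics import mean, stdev

def paired_t(a, b):
    """Paired-sample t statistic and degrees of freedom for matched lists."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    # Mean difference divided by its standard error.
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1

# Hypothetical mean response times (ms) for six participants.
no_exo = [1100, 1200, 1000, 1300, 1150, 1050]
exo = [1050, 1180, 990, 1200, 1100, 1060]

T_CRIT_DF5 = 2.571  # two-tailed critical value, alpha = .05, 5 degrees of freedom
t, df = paired_t(no_exo, exo)
print(f"t = {t:.2f}, df = {df}, significant = {abs(t) > T_CRIT_DF5}")
```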

Table 5. Descriptive and paired-sample t-test results for response time and error number

5.3. Discussion

Evaluating exoskeleton performance across varied physical tasks is essential for understanding their capabilities and refining the evaluation protocols that enable such testing. Stair climbing is of particular interest due to its prevalence in daily life and the unique biomechanical challenges it presents. While research on stair ascent and descent has historically been limited, the development of exoskeletons capable of addressing these complex movements highlights the need for structured evaluation protocols as exoskeleton technology advances.

In this work, we introduced a reconfigurable, sensorized staircase testbed designed for evaluating and benchmarking exoskeleton functionality during stair negotiation. While the testbed itself provides a comprehensive and adaptable platform for assessing exoskeletons, our focus here was on testing the protocols and demonstrating their potential. The preliminary results show that the protocols can capture relevant biomechanical and cognitive metrics. These initial measurements are tentative and serve as a showcase for the platform’s ability to support diverse exoskeleton types rather than offering definitive benchmarks.

We applied the protocols to two lower-limb exoskeletons (LLEs) with distinct characteristics. TWIN, a modular exoskeleton developed by the Italian Institute of Technology, demonstrated reliable gait phase consistency, but its slower-than-natural gait – particularly its prolonged gait cycle (around 7 s compared to around 3 s for unaided walking) – illustrated the testbed’s ability to capture unique device characteristics. In contrast, BELK, an active knee exoskeleton by Gogoa, achieved faster, more dynamic movements. These differences highlight the versatility of the testbed in accommodating various device configurations and performance levels.

The temporal phases protocol was tested with BELK using an ML algorithm trained on healthy subject data. Results demonstrated that the algorithm could identify gait sub-phases even when using the exoskeleton, suggesting that the protocol could work across different user profiles and devices. However, further adaptations – such as retraining the algorithm with data from exoskeleton users – could improve accuracy when movement patterns deviate from physiological norms, such as limited ankle mobility due to the device.

Additionally, we tested cognitive workload metrics to assess human–exoskeleton interactions. Preliminary tests using the XSPINE back-support exoskeleton showed no significant difference in cognitive performance between exoskeleton and no-exoskeleton conditions, illustrating the testbed’s potential to integrate cognitive metrics into exoskeleton evaluation. While these results do not provide conclusive trends, they emphasize the importance of considering cognitive aspects in future research on exoskeleton performance, especially for tasks like stair navigation where both physical and cognitive demands are critical.

The STEPbySTEP testbed, along with its protocols and metrics, is now available at the EUROBENCH facility in Brunete, Spain (https://github.com/eurobench). This standardized environment offers a valuable platform for exoskeleton testing and benchmarking, supporting the field as new devices for stair navigation continue to emerge. Currently, there are protocols and metrics for validating exoskeletons, but, to the best of our knowledge, there is a lack of standardized benchmarking frameworks specifically for these devices. The Eurobench initiative was created to address this gap by developing an infrastructure dedicated to the benchmarking of wearable robotics, ensuring a more consistent and comparable assessment across devices and real-world contexts. The protocols and metrics presented in our work align with Eurobench guidelines and requirements, ensuring compatibility with this benchmarking framework. Such a testbed, together with the other testbeds within the EUROBENCH project, can enhance exoskeleton development. In particular, from an industrial and commercialization point of view, it will be possible to perform standardized and structured evaluations of the performance of an exoskeleton product that is ready for the market.

While this study highlights the potential of the STEPbySTEP testbed and its associated protocols to capture meaningful metrics for stair climbing scenarios, certain limitations warrant further investigation. Two main aspects deserve particular consideration.

First, the protocols and metrics were tested using exoskeletons worn only by healthy subjects. While we expect kinematics and dynamics to differ when used by individuals with neurological impairments, the testbed is fully capable of collecting and processing relevant data. The results obtained will naturally vary depending on the user group, but this variation is precisely what the platform is designed to capture and analyze.

Second, the number of exoskeletons used was limited to two lower-limb devices and one back-support device. However, this does not constrain the testbed’s applicability. The modular and customizable design of the staircase allows the evaluation of a broad range of exoskeletons, including those developed for non-rehabilitative applications. By adjusting the step configuration, the platform can be adapted for testing high-performance exoskeletons or even legged robots (e.g., humanoids) used in military, industrial, or space environments. The testbed supports step heights ranging from 11 cm to over 1 m, enabling assessments across different operational contexts.

One final limitation of this study is the lack of a direct comparison between two exoskeletons using the same protocol, making it impossible to determine any particular strengths or weaknesses of each device. However, the primary objective of this study was to assess the staircase along with the protocols, verifying their applicability and sensitivity. Nevertheless, we are confident that we have developed a benchmarking framework capable of highlighting both the strengths and weaknesses of different exoskeletons. Supporting this claim, the protocols and metrics have proven to be sensitive in detecting differences between subjects using the same exoskeleton, as well as intra-subject differences when using the staircase in two different setups.

In summary, despite these limitations, the STEPbySTEP testbed and its associated protocols demonstrate the potential to capture important metrics in realistic stair climbing scenarios. The observed gait patterns, usability metrics, and biomechanical data provide insights into how exoskeletons might perform in real-world applications, such as rehabilitation clinics and industrial settings. To bridge the gap between controlled testing and real-world application, the staircase testbed can be configured to replicate specific real-life scenarios. This adaptability allows for early identification of potential challenges and facilitates the development of targeted solutions, improving exoskeleton integration into diverse environments. These protocols, metrics, and the adaptability of the testbed together provide a foundation for future research, enabling the development of standardized benchmarks, supporting exoskeleton improvement, and contributing to the advancement of both rehabilitative and assistive exoskeleton technologies.

6. Conclusions

This paper presented STEPbySTEP, a reconfigurable, sensorized staircase testbed created to support standardized exoskeleton evaluation within Eurobench. Through trials with three exoskeletons, we showcased the adaptability of our protocols and tentative metrics, demonstrating the platform’s potential for versatile benchmarking across various devices. While preliminary, these findings underscore the testbed’s promise as a flexible evaluation tool. Future work will focus on refining protocols and expanding metrics to support broader benchmarking applications, advancing the development of standardized approaches for exoskeleton performance assessment.

Data availability statement

Data will be made available upon request. Please contact .

Acknowledgments

The authors want to thank Roberto Bozzi and Joao Carlos Dalberto (CNR-STIIMA) for their support in designing and building the testbed; Indya Ceroni, Stefano Maludrottu, Angelo Musumarra, Silvia Scarpetta, Marianna Semprini, Christian Vassallo, and Gaia Zinni (Rehab Technologies Lab, IIT) for the management and control of the TWIN exoskeleton; Maurizio Ferrarin, Giorgia Fusaroli, Johanna Jonsdottir, and Tiziana Lencioni (LAMoBIR and LaRiCE at IRCCS Fondazione Don Carlo Gnocchi) for their expertise and management of the acquisitions.

Authors contribution

The authors contributed to the paper as follows: Conceptualization, M.C., M.M., D.B., F.F., and L.R.; Realization of the testbed, M.C., N.M., M.M., F.A., T.D., D.B., M.E., I.D., S.A., and L.R.; Software, M.C., N.M., D.B., G.P., F.F., M.M., A.A.S., and L.R.; Supervision, M.C., M.M., D.B., L.P., and L.R.; Paper writing and editing, M.C., M.M., D.B., G.P., F.F., and L.R.

Funding statement

This paper has received funding from the European Union’s Horizon 2020 research and innovation program, via an Open Call issued and executed under Project EUROBENCH (grant agreement no. 779963) – STEPbySTEP project, XSPINE project, and REMOTe_XSPINE project.

This paper has received funding from the EIT Manufacturing – SUPERHUMAN project.

Competing interests

The authors declare no competing interests exist.

Ethical standard

Ethical approval for this study was provided by the CSIC Ethics Committee (approval number 091/2021).

References

Alamdari, A and Krovi, V (2017) Chapter two - a review of computational musculoskeletal analysis of human lower extremities. In Ueda, J and Kurita, Y (eds), Human Modelling for Bio-Inspired Robotics. Academic Press, pp. 37–73. https://doi.org/10.1016/B978-0-12-803137-7.00003-3.
Alingh, JF, Groen, BE, Kamphuis, JF, Geurts, ACH and Weerdesteyn, V (2021) Task-specific training for improving propulsion symmetry and gait speed in people in the chronic phase after stroke: A proof-of-concept study. Journal of Neuroengineering and Rehabilitation 18(1), 69.
Awad, LN, Bae, JH, O’Donnell, K, De Rossi, SMM, Hendron, K, Sloot, LH, Kudzia, P, Allen, S, Holt, KG, Ellis, TD and Walsh, CJ (2017) A soft robotic exosuit improves walking in patients after stroke. Science Translational Medicine 9(400), eaai9084.
Awad, LN, Lewek, MD, Kesar, TM, Franz, JR and Bowden, MG (2020) These legs were made for propulsion: Advancing the diagnosis and treatment of post-stroke propulsion deficits. Journal of Neuroengineering and Rehabilitation 17(1), 139.
Benson, I, Hart, K, Tussler, D and van Middendorp, JJ (2016) Lower-limb exoskeletons for individuals with chronic spinal cord injury: Findings from a feasibility study. Clinical Rehabilitation 30(1), 73–84.
Bush, P and ten Hompel, S (2017) An integrated craft and design approach for wearable orthoses. Design for Health 1(1), 86–104.
Collins, SH, Wiggin, MB and Sawicki, GS (2015) Reducing the metabolic cost of walking with an untethered exoskeleton. Nature 522(7555), 212–215.
Dalla Gasperina, S, Roveda, L, Pedrocchi, A, Braghin, F and Gandolla, M (2021) Review on patient-cooperative control strategies for upper-limb rehabilitation exoskeletons. Frontiers in Robotics and AI 8.
Damiano, DL, Martellotta, TL, Sullivan, DJ, Granata, KP and Abel, MF (2000) Muscle force production and functional performance in spastic cerebral palsy: Relationship of cocontraction. Archives of Physical Medicine and Rehabilitation 81(7), 895–900. https://doi.org/10.1053/apmr.2000.5579.
De Bock, S, Ghillebert, J, Govaerts, R, Tassignon, B, Rodriguez-Guerrero, C, Crea, S, Veneman, J, Geeroms, J, Meeusen, R and De Pauw, K (2022) Benchmarking occupational exoskeletons: An evidence mapping systematic review. Applied Ergonomics 98, 103582.
Dollar, AM and Herr, H (2008) Lower extremity exoskeletons and active orthoses: Challenges and state-of-the-art. IEEE Transactions on Robotics 24(1), 144–158.
Farris, RJ, Quintero, HA and Goldfarb, M (2012) Performance evaluation of a lower limb exoskeleton for stair ascent and descent with paraplegia. 2012 Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 1908–1911.
Galle, S, Malcolm, P, Collins, SH and De Clercq, D (2017) Reducing the metabolic cost of walking with an ankle exoskeleton: Interaction between actuation timing and power. Journal of Neuroengineering and Rehabilitation 14(1), 35. https://doi.org/10.1186/s12984-017-0235-0.
Hamberg-van Reenen, HH, van der Beek, AJ, Blatter, BM, van der Grinten, MP and van Mechelen, W (2008) Does musculoskeletal discomfort at work predict future musculoskeletal pain? Ergonomics 51(5), 637–648.
Harper, NG, Wilken, JM and Neptune, RR (2018) Muscle function and coordination of stair ascent. Journal of Biomechanical Engineering 140(1), 10.
Kermavnar, T, de Vries, AW, de Looze, MP and O’Sullivan, LW (2021) Effects of industrial back-support exoskeletons on body loading and user experience: An updated systematic review. Ergonomics 64(6), 685–711.
Koch, I, Poljac, E, Müller, H and Kiesel, A (2018) Cognitive structure, flexibility, and plasticity in human multitasking—An integrative review of dual-task and task-switching research. Psychological Bulletin 144(6), 557.
Laffranchi, M, D’Angella, S, Vassallo, C, Piezzo, C, Canepa, M, De Giuseppe, S, Di Salvo, M, Succi, A, Cappa, S, Cerruti, G, Scarpetta, S, Cavallaro, L, Boccardo, N, D’Angelo, M, Marchese, C, Saglia, JA, Guanziroli, E, Barresi, G, Semprini, M, et al. (2021) User-centered design and development of the modular TWIN lower limb exoskeleton. Frontiers in Neurorobotics 15, 709731.
Lafranchi, M, Botsch, M, Tonietti, G and Bicchi, A (2011) Sizing actuators in wearable robotics: A data-driven model for the actuation of the lower limb in human locomotion. 2011 IEEE International Conference on Rehabilitation Robotics, 1–8.
Latash, ML (2018) Muscle coactivation: Definitions, mechanisms, and functions. Journal of Neurophysiology 120(1), 88–104. https://doi.org/10.1152/jn.00084.2018.
Lewis, JR and Sauro, J (2017) Revisiting the factor structure of the system usability scale. Journal of Usability Studies 12(4), 183–192.
Maugliani, N, Caimmi, M, Malosio, M, Airoldi, F, Borro, D, Rosquete, D, Sergio, A, Giusino, D, Fraboni, F, Ranieri, G, et al. (2020) Lower-limbs exoskeletons benchmark exploiting a stairs-based testbed: The STEPbySTEP project. International Symposium on Wearable Robotics, 603–608.
Mauri, A, Lettori, J, Fusi, G, Fausti, D, Mor, M, Braghin, F, Legnani, G and Roveda, L (2019) Mechanical and control design of an industrial exoskeleton for advanced human empowering in heavy parts manipulation tasks. Robotics 8(3), 65.
Novak, AC and Brouwer, B (2014) Relationship between stair ambulation with and without a handrail and centre of pressure velocities during stair ascent and descent. Gait & Posture 39(1), 158–164. https://doi.org/10.1016/j.gaitpost.2013.06.005.
O’Sullivan, L, Power, V, Virk, G, Masud, N, Haider, U, Christensen, S, Bai, S, Cuypers, L, D’Havé, M and Vonck, K (2015) End user needs elicitation for a full-body exoskeleton to assist the elderly. Procedia Manufacturing 3, 1403–1409.
Orekhov, G, Fang, Y, Cuddeback, CF and Lerner, ZF (2021) Usability and performance validation of an ultra-lightweight and versatile untethered robotic ankle exoskeleton. Journal of Neuroengineering and Rehabilitation 18(1), 116.
Plummer-D’Amato, P, Altmann, LJP, Saracino, D, Fox, E, Behrman, AL and Marsiske, M (2008) Interactions between cognitive tasks and gait after stroke: A dual task study. Gait & Posture 27(4), 683–688.
Recinos, E, Abella, J, Riyaz, S and Demircan, E (2020) Real-time vertical ground reaction force estimation in a unified simulation framework using inertial measurement unit sensors. Robotics 9(4), 88. MDPI AG. ISSN:2218-6581. http://doi.org/10.3390/robotics9040088.
Roveda, L, Haghshenas, S, Caimmi, M, Pedrocchi, N and Tosatti, L (2019) Assisting operators in heavy industrial tasks: On the design of an optimized cooperative impedance fuzzy-controller with embedded safety rules. Frontiers in Robotics and AI 6, 75.
Roveda, L, Pesenti, M, Rossi, M, Covarrubias, M, Pedrocchi, A, Braghin, F and Gandolla, M (2022) User-centered back-support exoskeleton: Design and prototyping. CIRP Annals.
Roveda, L, Savani, L, Arlati, S, Dinon, T, Legnani, G and Tosatti, LM (2020) Design methodology of an active back-support exoskeleton with adaptable backbone-based kinematics. International Journal of Industrial Ergonomics 79, 102991.
Salarian, A, Russmann, H, Vingerhoets, FJ, Dehollain, C, Blanc, Y, Burkhard, PR and Aminian, K (2004) Gait assessment in Parkinson’s disease: Toward an ambulatory system for long-term monitoring. IEEE Transactions on Biomedical Engineering 51(8), 1434–1443.
Schenck, C and Kesar, TM (2017) Effects of unilateral real-time biofeedback on propulsive forces during gait. Journal of Neuroengineering and Rehabilitation 14(1).
Schmidt, B, Böhm, V, Günther, T, Kretschmer, F, Waschek, S, Baumann, M, Wörteler, M and Riedl, M (2017) Wearable technologies for factory workers: Exoskeletons, wearable sensors, and smart personal protective equipment in industry 4.0. Wearable Technologies 1(2), 17–32.
Shaw, EP, Rietschel, JC, Hendershot, BD, Pruziner, AL, Miller, MW, Hatfield, BD and Gentili, RJ (2018) Measurement of attentional reserve and mental effort for cognitive workload assessment under various task demands during dual-task walking. Biological Psychology 134, 39–51.
Shore, L, Power, V, Hartigan, B, Schülein, S, Graf, E, de Eyto, A and O’Sullivan, L (2019) Exoscore: A design tool to evaluate factors associated with technology acceptance of soft lower limb exosuits by older adults. Human Factors, 1–20.
Stessman, J, Rottenberg, Y and Jacobs, JM (2017) Climbing stairs, handrail use, and survival. The Journal of Nutrition, Health & Aging 21(2), 195–201.
Torricelli, D, Veneman, J, Gonzalez-Vargas, J, Mombaur, K and Remy, DC (2019) Editorial: Assessing bipedal locomotion: Towards replicable benchmarks for robotic and robot-assisted locomotion. Frontiers in Neurorobotics 13.
Tricomi, E, Mossini, M, Missiroli, F, Lotti, N, Zhang, X, Xiloyannis, M, Roveda, L and Masia, L (2023) Environment-based assistance modulation for a hip exosuit via computer vision. IEEE Robotics and Automation Letters 8(5), 2550–2557.
Van Der Laan, JD, Heino, A and De Waard, D (1997) A simple procedure for the assessment of acceptance of advanced transport telematics. Transportation Research Part C: Emerging Technologies 5(1), 1–10.
Vaughan, CL, Davis, BL and O’Connor, JC (1999) Dynamics of Human Gait. Human Kinetics.
Winter, DA (1991) The Biomechanics and Motor Control of Human Gait: Normal, Elderly and Pathological. University of Waterloo Press.
Wolff, J, Parker, C, Borisoff, J, Mortenson, BW and Mattie, J (2014) A survey of stakeholder perspectives on exoskeleton technology. Journal of Neuroengineering and Rehabilitation 11.
Zhang, X, Chen, X, Huo, B, Liu, C, Zhu, X, Zu, Y, Wang, X, Chen, X and Sun, Q (2023) An integrated evaluation approach of wearable lower limb exoskeletons for human performance augmentation. Scientific Reports 13(1), 4251.

Figure 1. The STEPbySTEP staircase prototype.

Figure 2. Kinematics of the staircase mechanism.

Figure 3. Adjustments realized in the prototype.

Figure 4. Discrepancy of the staircase configurations with respect to the limits imposed by the international standard ISO 14122-3 (1).

Figure 5. Load cells and force plates installed on the staircase.

Figure 6. Adapted from Harper et al. (2018).

Table 1. Temporal metrics

Figure 7. Knee joint angle of a healthy subject during stair ascent and descent, with (left) and without (right) a lower limb exoskeleton (LLE).

Figure 8. (a) A user wearing TWIN, ascending the STEPbySTEP staircase in the 11 cm step-height configuration; the frame shows the right double-support phase. (b) BELK exoskeleton during descent of the STEPbySTEP staircase. (c) XSPINE back-support exoskeleton.

Table 2. Gait phases in seconds and as a percentage of the gait cycle, for step heights of 11 cm (Cond 1) and 17 cm (Cond 2). St = Stance; SW = Swing; DS = Double Support; GC = Gait Cycle

Figure 9. Ground and handrail reaction forces and the EMG activity of the rectus femoris and biceps femoris (left panels). All signals are synchronized, and the gait phases are shown for clarity. Right panel: an example comparison of condition 2 (step height = 17 cm) against condition 1 (step height = 11 cm), with condition 1 set to 1 as the reference.

Figure 10. Ascending mocap data example. Ground truth sub-phases manually labeled for a healthy user without an exoskeleton (left) compared to ML algorithm predictions with an exoskeleton (right).

Table 3. Temporal metrics for the ascent shown in Figure 10. Gray cells give each value as a percentage of the gait cycle. Stance is the sum of the stance sub-phases, and swing is the sum of the swing sub-phases

Table 4. Descriptive statistics for usability, acceptance, and local perceived discomfort scales

Table 5. Descriptive statistics and paired-samples t-test results for response time and number of errors