Abstract: In this digital age, human activity recognition (HAR) plays an increasingly important role in almost all areas of life, such as assistive medical care, rehabilitation technology, and interactive entertainment, improving people's quality of life. Besides external sensing, sensor-based internal sensing for HAR is also intensively studied. A large body of research involves recognizing various kinds of everyday human activities, including walking, standing, jumping, and performing gestures. HAR research relies on large amounts of data, which includes laboratory data collections that meet in-house research goals as well as the use of external, public databases to verify models and methods. Data collection is therefore an essential part of our entire HAR research work, and we describe this extensive process in detail.

Many public HAR datasets are available online, providing various sorts of collected data, some of which share similarities with our in-house data acquisition in terms of purpose, sensor selection, or protocol design. For instance, the Opportunity benchmark database (Chavarriaga et al., 2013) contains naturalistic daily living activities recorded with a large set of on-body sensors. The UniMiB-SHAR dataset (Micucci et al., 2017) includes 11,771 samples of both human activities and falls divided into 17 fine-grained classes. The GaitAnalysisDataBase (Loose and Bolmgren, 2019) contains 3D walking kinematics and muscle activity data from healthy adults walking at normal, slow, or fast pace on flat ground or at incremental speeds on a treadmill. The RealWorld dataset (Sztyler and Stuckenschmidt, 2016) covers acceleration, GPS, gyroscope, light, magnetic field, and sound level data for the activities climbing stairs down and up, jumping, lying, standing, sitting, running/jogging, and walking of 15 subjects. The FORTH-TRACE dataset (Karagiannaki et al., 2016) was collected from 15 participants wearing five Shimmer wearable sensor nodes on the left/right wrist, the torso, the right thigh, and the left ankle. The ENABL3S dataset (Hu et al., 2018) contains bilateral electromyography (EMG) and joint and limb kinematics recorded from wearable sensors for ten able-bodied individuals as they freely transitioned between sitting, standing, and five walking-related activities.

In this article, we disclose our in-house collected sensor-based dataset, CSL-SHARE (Cognitive Systems Lab Sensor-based Human Activity REcordings). Building on the recording plans and organizational experience gathered from the collection of the pilot datasets CSL17 (1 subject, 7 activities of daily living, 15 minutes) and CSL18 (4 subjects, 21 activities of daily living and sports, 90 minutes), the CSL-SHARE dataset covers 22 types of activities of daily living and sports from 20 subjects in a total time of 691 minutes, of which 363 minutes are segmented and annotated. For this dataset, we used two 3D-accelerometers, two 3D-gyroscopes, four surface electromyography (sEMG) sensors, one 2D-electrogoniometer, and one airborne microphone integrated into a knee bandage, bringing the total number of channels to 19. According to existing studies, such as (Mathie et al., 2003), (Rebelo et al., 2013), (Kwapisz et al., 2010), (Rowe et al., 2000), (Whittle, 1996), and (Teague et al., 2016), these sensors provide usable and reliable biosignals for HAR research, gait analysis, and health assessment.
We also tried to use a piezoelectric microphone and a force sensor to sense acoustic and physical pressure signals from the knee during the acquisition. However, subsequent analysis and research provided no evidence of their contribution to HAR research, so we removed these two signal channels from the public dataset. In addition, although our two pilot datasets mentioned above, CSL17 and CSL18, are not publicly available due to their relatively small data volume, they can also be obtained from us for scientific research purposes.

We chose the biosignalsplux Researcher Kit, supplied together with the selected sensor types. One hub from the kit records biosignals from 8 channels, each at up to 16 bits, simultaneously. Since we needed to record over 20 channels, we connected three hubs via synchronization cables, which synchronize all channels across the hubs automatically at the beginning of each recording session and thereby ensure synchronicity during the entire session.

The sensor positioning on the right-leg-worn knee bandage was decided in collaboration with kinesiologists of the Institute of Sport and Sports Science at Karlsruhe Institute of Technology, based on their substantial research experience in knee kinematics (Stetter et al., 2018) (Stetter et al., 2019), to capture ambulation activities. The CSL-SHARE sensor positions and the muscles/body parts they measure are listed as follows:

• 3D-accelerometer 1 (upper): thigh, proximal ventral;
• 3D-accelerometer 2 (lower): shank, distal ventral;
• 3D-gyroscope 1 (upper): thigh, proximal ventral;
• 3D-gyroscope 2 (lower): shank, distal ventral;
• EMG 1 (upper-front): musculus vastus medialis;
• EMG 2 (lower-front): musculus tibialis anterior;
• EMG 3 (upper-back): musculus biceps femoris;
• EMG 4 (lower-back): musculus gastrocnemius;
• 2D-electrogoniometer (lateral): knee of the right leg;
• Airborne microphone (lateral): knee of the right leg.

EMG and microphone signals were recorded with a sampling rate of 1000 Hz and all other signals with 100 Hz. The channels sampled at 100 Hz were up-sampled to 1000 Hz to be synchronized and aligned with the high-sampled channels, as sketched below. All channels have a quantization resolution of 16 bits.
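The exact interpolation method used for the up-sampling is not specified here; the following is a minimal Python sketch, assuming simple sample repetition (zero-order hold) from 100 Hz to 1000 Hz:

```python
import numpy as np

def upsample_zero_order_hold(signal_100hz: np.ndarray, factor: int = 10) -> np.ndarray:
    """Up-sample a 100 Hz channel to 1000 Hz by repeating each sample.

    Zero-order hold is an assumption for illustration; linear
    interpolation (e.g., np.interp) would be an alternative.
    """
    return np.repeat(signal_100hz, factor)

# Example: 1 second of a 100 Hz accelerometer channel -> 1000 aligned samples
acc_100hz = np.random.randn(100)
acc_1000hz = upsample_zero_order_hold(acc_100hz)
assert acc_1000hz.shape == (1000,)
```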
We developed a software called Activity Signal Kit (ASK) with a graphical user interface (GUI) and multiple functionalities, using the driver library provided by biosignalsplux, as introduced in (Liu and Schultz, 2018). ASK automatically connects and synchronizes several recording hubs, then collects up to 24 channels of sensor data from all hubs simultaneously and continuously. All recorded data are archived automatically in HDF5 files, named by date and timestamp, for further research. A protocol-for-pushbutton mechanism for segmentation and annotation has been implemented in the ASK software and is introduced in Section 2.3. Moreover, the ASK software also provides digital signal processing, feature extraction, modeling, training, and recognition functionalities by applying our in-house developed HMM-based decoder BioKIT (Telaar et al., 2014).

The task of segmentation in HAR research is to split a relatively long sequence of activities into several segments of a single activity each, while annotation is the process of labeling each segment, such as "walk", "run", or "stand-to-sit". Segmentation, which can be performed manually (Rebelo et al., 2013), semi-supervised (Barbič et al., 2004), or automatically (Guenterberg et al., 2009) (Micucci et al., 2017), is a prerequisite for annotation, and its output serves as input for digital signal processing and feature extraction. Annotation, which can be performed directly after each segmentation subtask, supports two follow-up operations: training and evaluation.

In our research, we applied the pushbutton of the biosignalsplux Researcher Kit in our proposed semi-automated segmentation and annotation solution. The applicability of the semi-automatically segmented data has been verified for our research purposes in numerous subsequent experiments (see Section 4), so we have been applying this mechanism to our successively acquired datasets.

The so-called protocol-for-pushbutton mechanism of segmentation and annotation is implemented in the ASK software (Liu and Schultz, 2018). When the "segmentation and annotation" mode is switched on during data acquisition, a predefined activity sequence protocol is loaded into the software, which prompts the user to perform the activities one after another. Each activity is displayed on the screen one by one while the user controls the activity recording by pushing, holding, and releasing the pushbutton, following the instructions of the software step by step. For example, if the prompted activity is "walk," the user sees the instruction "Please hold the pushbutton and do: walk." The user prepares, pushes the button and starts to walk, keeps holding the pushbutton while walking for a duration at will, and then releases the pushbutton to finish the activity. Upon release, the system displays the next activity instruction, e.g., "stand-to-sit," and the process continues until the predefined acquisition protocol is fully processed.

The ASK software records the timestamps/sample numbers of each button push and release during the data recording and archives them in CSV files as segmentation and annotation results for each activity. Since all data are synchronized at 1000 Hz, each sample represents 1 millisecond of data. For example, the line "sit, 3647, 6163" in a CSV file means that the activity segment labeled "sit" lasts 2,516 samples, from timestamp 3647 to 6162, corresponding to 2.516 seconds. These 2,516 samples form one segment for training the activity model "sit" or for evaluating recognition results.
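For illustration, here is a minimal Python sketch of how such an annotation CSV can be parsed; the file name "01.csv" is an assumption, and the start/end convention (end exclusive, matching the 2,516-sample example above) should be verified against the data:

```python
import csv

def load_segments(csv_path):
    """Parse an annotation CSV with rows like: sit, 3647, 6163.

    Returns (label, start, end) tuples. At 1000 Hz, one sample spans
    1 ms, so a segment of (end - start) samples lasts that many ms.
    """
    segments = []
    with open(csv_path, newline="") as f:
        for label, start, end in csv.reader(f):
            segments.append((label.strip(), int(start), int(end)))
    return segments

# Example: print the duration of every segment of one protocol recording
for label, start, end in load_segments("01.csv"):
    print(f"{label}: {(end - start) / 1000.0:.3f} s")  # "sit, 3647, 6163" -> 2.516 s
```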
The protocol-for-pushbutton mechanism was implemented to reduce the time and labor costs of manual annotation. The resulting segmentations are excellent, required little to no manual correction, and lay a good foundation for subsequent research. Nevertheless, the mechanism has some limitations:

• It can only be applied during acquisition and is incapable of segmenting archived data;
• Clear activity start-/endpoints need to be defined, which is impossible in cases like field studies;
• Activities requiring both hands are not possible because participants hold the pushbutton;
• The pushbutton operation may consciously or subconsciously affect the activity execution;
• A participant forgetting to push or release the button causes subsequent segmentation errors.

None of these limitations, except forgetting to push or release the pushbutton, holds in a laboratory setting with clear instructions and protocols. Hence, only pushbutton misoperation applied to the collection of the CSL-SHARE data; it was addressed by real-time human monitoring of the incoming sensor signals, including the pushbutton, during acquisition. Additionally, a mobile phone video camera was used for post verification and adjustments (see Section 2.4).

Although the "segmentation and annotation" mode of the ASK software was switched on to segment and annotate the recorded data efficiently, the mobile phone video camera recorded the entire biosignal acquisition sessions so that misoperations of pushing/holding/releasing could be corrected manually after the data recording. After each recording event with one subject, the collected data and the automatically generated segments with annotation labels were examined thoroughly based on the video. Segments with minor human-factor errors were corrected by manually shifting the start-/endpoint forward/backward a short distance, while segments with problems that could not easily be corrected were discarded, which is one of the reasons for the slight divergence among the activity occurrences in Table 1. A script to automatically detect activity-length outliers was also implemented to assist the segmentation verification (a sketch follows below). After finishing the correction and verification, we deleted all recorded videos to preserve privacy.
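The concrete outlier criterion of that script is not stated; the following is a minimal sketch, assuming a per-activity z-score rule over segment durations:

```python
import numpy as np

def flag_duration_outliers(durations_ms, z_thresh=3.0):
    """Flag segments whose duration deviates strongly from the class mean.

    The z-score criterion and the threshold of 3 are assumptions for
    illustration; the original script's rule may differ.
    """
    d = np.asarray(durations_ms, dtype=float)
    z = (d - d.mean()) / (d.std() + 1e-9)
    return np.abs(z) > z_thresh

# Example: ten plausible "walk" segments plus one where the button
# was likely not released in time (9.8 s instead of ~3 s)
walk_ms = [2850, 3010, 2930, 2990, 3100, 2880, 2950, 3050, 2900, 2970, 9800]
print(flag_duration_outliers(walk_ms))  # only the last segment is flagged
```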
The CSL-SHARE dataset was recorded in a controlled laboratory environment at the Cognitive Systems Lab, University of Bremen, and comprises 22 daily living and sports-related activities. The acquisition protocols of the CSL-SHARE recording events were strictly and normatively designed; activity parameters such as body steering angles and the number of steps are restricted. Most acquisition protocols contain only one activity. However, two protocols contain two activities and one protocol contains four activities, because these activities can be practically and logically recorded one after another in a sequence, which also balances the activity occurrences. To follow the logical sequence of the activities and the protocol-for-pushbutton segmentation and annotation mechanism (see Section 2.3), the order of the activities in these three multi-activity protocols must be observed during recording. Figure 1 illustrates a diagrammatic sketch of all recording protocols, helping to understand the recording procedure and activity details more intuitively.

The 22 activities and the 17 acquisition protocols are described as follows:

(Protocols 1-17)

The number of repetitions/activities to record per protocol follows a pre-designed plan. In the post verification (see Section 2.4), a few non-conforming and erroneous segments were removed.

The meaning of most activities in the CSL-SHARE dataset is self-explanatory from their names or from the descriptions in the protocols. The "spin-left"/"spin-right" activity can be understood as the "Left face!"/"Right face!" action in the army (but in daily situations, not as stressful as in military training). The "V-cut" activity is a step in which a body rotation (instead of a directional change) takes place.

Some activities in the CSL-SHARE dataset are subdivisions of original activities. For example, "spin-left" is divided into "spin-left-left-first" and "spin-left-right-first," denoting which foot moves first. Similarly, "spin-right," "V-cut-left," and "V-cut-right" are also divided into two activities each with regard to the first-moved foot. These activities are subdivided because they involve only one gait cycle, and we only use the sensors placed on the right-leg-worn bandage; performing them "left foot first" versus "right foot first" therefore leads to very different signal patterns. In contrast, activities involving multiple (three) steps/gait cycles, such as "walk," the "walk-curve" variants, the "walk-stair" variants, "run," and the "shuffle" variants, were not further subdivided. Instead, the protocols restrict the number of gait cycles per segment of these activities to three and define the left foot as the start.

Subjects

Twenty subjects without any gait impairments, five female and fifteen male, aged between 23 and 43 (30.5 ± 5.8), participated in the data collection events; one subject had knee inflammation and could not perform certain activities. Each subject's participation took approximately two hours, including announcements and precautions, questions and answers, equipment fitting and adjustment, software preparation and test runs, acquisition following all protocols, breaks, and equipment removal.

Privacy Preservation and Data Security

All subjects signed a written informed consent form, and the study was conducted in accordance with the WMA (World Medical Association) Declaration of Helsinki (World Medical Association, 2013). According to the consent form, we only kept the wearable sensor data in pseudonymized form and did not retain any identifying information about the participants. The shared CSL-SHARE dataset is available in an anonymized form. As mentioned in Section 2.4, we used videos to verify the segmentation and annotation, and all videos were deleted after the post verification to protect privacy. In addition, the consent form stipulates that the use of the data is limited to non-commercial research purposes and that data users guarantee not to attempt to identify the participating persons. Furthermore, data users guarantee to pass on the data (or data derived from it) only to third parties bound by the same rules of use (non-commercial research purposes, no identification attempts, restricted disclosure). Data users who violate these usage regulations bear the legal consequences themselves; the dataset publisher takes no responsibility.

Data Format

We provide the CSL-SHARE dataset in an anonymized form in the following directory structure and file format:
The root directory contains a total of 20 sub-directories numbered 1-20, representing the data of the 20 subjects. Each sub-directory contains 34 files: the seventeen .H5 files, named by protocol number, store the raw recorded data of the seventeen protocols in HDF5 format, while the seventeen corresponding .CSV files contain the annotation results.

The rows in the .H5 files follow this sensor order: EMG 1, EMG 2, EMG 3, EMG 4, airborne microphone, accelerometer upper X, accelerometer upper Y, accelerometer upper Z, electrogoniometer X, accelerometer lower X, accelerometer lower Y, accelerometer lower Z, electrogoniometer Y, gyroscope upper X, gyroscope upper Y, gyroscope upper Z, gyroscope lower X, gyroscope lower Y, gyroscope lower Z (a loading sketch follows the list of exceptions below).

Three sub-directories/sub-datasets contain exceptions:

• Sub-directory 2: The 02.CSV and 05.CSV files deviate from protocols 2 and 5: the labels are intermixed because Subject 2 turned the body by wrong angles between activities. We were not aware of this during the data collection process; the problem was first discovered through the video in the post verification. However, this mixture affects neither the integrity of the dataset nor the number of times each activity occurs;
• Sub-directory 11: Protocol 13 is divided into two parts due to a device communication interruption;
• Sub-directory 16: Not all activities were performed due to the subject's knee inflammation, which is one of the reasons for the slight divergence among the activity occurrences in Table 1 (see Section 2.4 for another reason).
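As an illustration of this layout, here is a minimal Python loading sketch; the internal HDF5 dataset key ("data") and the (19, n_samples) array shape are assumptions for illustration and should be checked with h5py against the actual files:

```python
import h5py

# Channel order as documented above (one row per channel in each .H5 file)
CHANNELS = [
    "EMG1", "EMG2", "EMG3", "EMG4", "microphone",
    "acc_upper_X", "acc_upper_Y", "acc_upper_Z", "goniometer_X",
    "acc_lower_X", "acc_lower_Y", "acc_lower_Z", "goniometer_Y",
    "gyro_upper_X", "gyro_upper_Y", "gyro_upper_Z",
    "gyro_lower_X", "gyro_lower_Y", "gyro_lower_Z",
]

def load_protocol(h5_path, dataset_key="data"):
    """Load one protocol recording as {channel_name: 1-D array at 1000 Hz}."""
    with h5py.File(h5_path, "r") as f:
        raw = f[dataset_key][...]   # assumed shape: (19, n_samples)
    assert raw.shape[0] == len(CHANNELS)
    return dict(zip(CHANNELS, raw))

signals = load_protocol("1/01.h5")  # subject 1, protocol 1 (per the layout above)
emg_vastus = signals["EMG1"]        # musculus vastus medialis
```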
Statistical Analysis

The 22-activity CSL-SHARE dataset contains 11.52 hours of data (of which 6.03 hours have been segmented and annotated) from 20 subjects, 5 female and 15 male. Table 1 gives the number of segments, the total effective length over all segments, and the minimal/maximal/mean segment length for each of the 22 activities. Analyzing the duration distribution of each activity over all subjects in histograms, we find that the durations of all activities are approximately normally distributed. The distributions of the activities "sit" and "stand" deviate slightly, as these activities can last arbitrarily long.

Conclusion

We share our in-house collected dataset CSL-SHARE (Cognitive Systems Lab Sensor-based Human Activity REcordings) in this article and introduce its recording procedure and technical details. This 19-channel, 22-activity, 20-subject dataset applies two triaxial accelerometers, two triaxial gyroscopes, four EMG sensors, one electrogoniometer, and one airborne microphone with sampling rates of up to 1000 Hz and uses a knee bandage as a novel wearable sensor carrier. Six hours of the 11.52 hours of recordings are segmented, annotated, and post-verified. The reliability and applicability of the CSL-SHARE dataset and its preceding pilot data collections have been demonstrated in the literature across various research aspects, such as the HAR research pipeline (Liu and Schultz, 2018), a real-time end-to-end HAR system (Liu and Schultz, 2019), visualized verification of multimodal feature extraction (Barandas et al., 2020), feature space dimensionality studies (Hartmann et al., 2020) (Hartmann et al., 2021), and human activity modeling (Liu et al., 2021), among others. Building on this robustness, we publish the CSL-SHARE dataset as an open sensor-based biosignal dataset for HAR, hoping to contribute research material to researchers in the same or similar fields.