Keywords: Synthetics, Dataset, Health, Physiology, Computer Vision
TL;DR: SCAMPS is a dataset of high-fidelity synthetics containing 2,800 videos (1.68M frames) of avatars with aligned cardiac and respiratory signals and facial action intensities.
Abstract: The use of cameras and computational algorithms for noninvasive, low-cost and scalable measurement of physiological (e.g., cardiac and pulmonary) vital signs is very attractive. However, diverse data representing a range of environments, body motions, illumination conditions and physiological states are laborious, time-consuming and expensive to obtain. Synthetic data have proven a valuable tool in several areas of machine learning, yet are not widely available for camera measurement of physiological states. Synthetic data offer "perfect" labels (e.g., without noise and with precise synchronization), offer labels that may not be possible to obtain otherwise (e.g., precise pixel-level segmentation maps), and provide a high degree of control over variation and diversity in the dataset. We present SCAMPS, a dataset of synthetics containing 2,800 videos (1.68M frames) with aligned cardiac and respiratory signals and facial action intensities. The RGB frames are provided alongside segmentation maps and precise descriptive statistics about the underlying waveforms, including inter-beat interval, heart rate variability, and pulse arrival time. Finally, we present baseline results training on these synthetic data and testing on real-world datasets to illustrate generalizability.
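To make the waveform statistics named in the abstract concrete, the following is a minimal sketch of how inter-beat interval, heart rate, and one common heart rate variability measure (RMSSD) could be derived from a pulse waveform. It does not use the SCAMPS files or reproduce the authors' method. The 600-sample length follows from 1.68M frames over 2,800 videos; the 30 Hz sampling rate, the sine-wave pulse, the choice of RMSSD, and the function names `find_peaks_simple` and `waveform_stats` are all assumptions made for illustration.

```python
import numpy as np


def find_peaks_simple(signal):
    """Return indices of local maxima that lie above the signal mean."""
    s = np.asarray(signal, dtype=float)
    mid = s[1:-1]
    is_peak = (mid > s[:-2]) & (mid >= s[2:]) & (mid > s.mean())
    return np.flatnonzero(is_peak) + 1


def waveform_stats(signal, fs):
    """Compute inter-beat intervals (s), mean heart rate (bpm), and RMSSD (s)."""
    peaks = find_peaks_simple(signal)
    ibi = np.diff(peaks) / fs                    # time between successive beats
    heart_rate = 60.0 / ibi.mean()               # beats per minute
    rmssd = np.sqrt(np.mean(np.diff(ibi) ** 2))  # one common HRV measure
    return {"ibi_s": ibi, "heart_rate_bpm": heart_rate, "rmssd_s": rmssd}


# Hypothetical stand-in for a per-video pulse label: 600 samples of a 1.2 Hz sine.
fs = 30.0                      # assumed sampling rate, not stated in the abstract
t = np.arange(600) / fs
pulse = np.sin(2 * np.pi * 1.2 * t)

stats = waveform_stats(pulse, fs)
print(stats["heart_rate_bpm"], stats["rmssd_s"])
```

On this perfectly periodic toy signal the beats are evenly spaced, so the heart rate comes out at about 72 beats per minute and the RMSSD is essentially zero. Real or synthetic pulse waveforms with beat-to-beat variation would yield a nonzero RMSSD and would typically need a more robust peak detector.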
Supplementary Material: pdf
Dataset Url: https://github.com/danmcduff/scampsdataset
Dataset Embargo: There is no dataset embargo.
Author Statement: Yes
License: Research Use of Data Agreement v1.0
This is the Research Use of Data Agreement, Version 1.0 (the “R-UDA”). Capitalized terms are defined in Section 5. Data Provider and you agree as follows:
1. Provision of the Data
1.1. You may use, modify, and distribute the Data made available to you by the Data Provider under this R-UDA for Research Use if you follow the R-UDA’s terms.
1.2. Data Provider will not sue you or any Downstream Recipient for any claim arising out of the use, modification, or distribution of the Data provided you meet the terms of the R-UDA.
1.3. This R-UDA does not restrict your use, modification, or distribution of any portions of the Data that are in the public domain or that may be used, modified, or distributed under any other legal exception or limitation.
2. Restrictions
2.1. You agree that you will use the Data solely for Computational Use for non-commercial research. This restriction means that you may engage in non-commercial research activities (including non-commercial research undertaken by or funded via a commercial entity), but you may not use the Data or any Results in any commercial offering, including as part of a product or service (or to improve any product or service) you use or provide to others.
2.2. You may not receive money or other consideration in exchange for use or redistribution of Data.
3. Redistribution of Data
3.1. You may redistribute the Data, so long as:
3.1.1. You include with any Data you redistribute all credit or attribution information that you received with the Data, and your terms require any Downstream Recipient to do the same; and
3.1.2. You bind each recipient to whom you redistribute the Data to the terms of the R-UDA.
4. No Warranty, Limitation of Liability
4.1. Data Provider does not represent or warrant that it has any rights whatsoever in the Data.
4.2. THE DATA IS PROVIDED ON AN “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES OR CONDITIONS OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
4.3. NEITHER DATA PROVIDER NOR ANY UPSTREAM DATA PROVIDER SHALL HAVE ANY LIABILITY FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING WITHOUT LIMITATION LOST PROFITS), HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE DATA OR RESULTS, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
5. Definitions
5.1. “Computational Use” means activities necessary to enable the use of Data (alone or along with other material) for analysis by a computer.
5.2. “Data” means the material you receive under the R-UDA in modified or unmodified form, but not including Results.
5.3. “Data Provider” means the source from which you receive the Data and with whom you enter into the R-UDA.
5.4. “Downstream Recipient” means any person or persons who receives the Data directly or indirectly from you in accordance with the R-UDA.
5.5. “Result” means anything that you develop or improve from your use of Data that does not include more than a de minimis portion of the Data on which the use is based. Results may include de minimis portions of the Data necessary to report on or explain use that has been conducted with the Data, such as figures in scientific papers, but do not include more. Artificial intelligence models trained on Data (and which do not include more than a de minimis portion of Data) are Results.
5.6. “Upstream Data Providers” means the source or sources from which the Data Provider directly or indirectly received, under the terms of the R-UDA, material that is included in the Data.
Contribution Process Agreement: Yes
In Person Attendance: Yes
Community Implementations: [1 code implementation (CatalyzeX)](https://www.catalyzex.com/paper/arxiv:2206.04197/code)