Abstract: Self-driving research often underrepresents cyclist collisions
and safety. To address this, we present CycleCrash,
a novel dataset consisting of 3,000 dashcam videos with
436,347 frames that capture cyclists in a range of critical
situations, from collisions to safe interactions. This dataset
enables 9 different cyclist collision prediction and classification
tasks focusing on potentially hazardous conditions
for cyclists and is annotated with collision-related,
cyclist-related, and scene-related labels. Next, we propose
VidNeXt, a novel method that leverages a ConvNeXt spatial
encoder and a non-stationary transformer to capture the
temporal dynamics of videos for the tasks defined in our
dataset. To demonstrate the effectiveness of our method and
create additional baselines on CycleCrash, we apply and
compare 7 models and report a detailed ablation study. We
release the dataset and code at
https://github.com/DeSinister/CycleCrash/.
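
The two-stage pattern described above (a spatial encoder applied per frame, followed by a temporal model over the resulting frame features) can be illustrated with a minimal, dependency-free sketch. This is not the VidNeXt implementation: `encode_frame` is a hypothetical stand-in for the ConvNeXt encoder, `temporal_attention` is plain single-head self-attention without learned weights rather than a non-stationary transformer, and the toy frames and the name `video_score` are assumptions made for illustration only.

```python
import math
from typing import List

Frame = List[List[float]]  # hypothetical toy frame: H x W grayscale values
Feature = List[float]      # one feature vector per frame


def encode_frame(frame: Frame) -> Feature:
    """Stand-in for a spatial encoder: one feature per image row (row mean)."""
    return [sum(row) / len(row) for row in frame]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_attention(features: List[Feature]) -> List[Feature]:
    """Stand-in for a temporal model: scaled dot-product self-attention
    across time steps, with no learned projections."""
    dim = len(features[0])
    mixed = []
    for query in features:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in features
        ]
        weights = softmax(scores)
        mixed.append([
            sum(w * value[i] for w, value in zip(weights, features))
            for i in range(dim)
        ])
    return mixed


def video_score(frames: List[Frame]) -> float:
    """Encode each frame, mix features over time, pool, and squash to (0, 1)."""
    features = [encode_frame(frame) for frame in frames]
    mixed = temporal_attention(features)
    dim = len(mixed[0])
    pooled = [sum(step[i] for step in mixed) / len(mixed) for i in range(dim)]
    logit = sum(pooled) / len(pooled)
    return 1.0 / (1.0 + math.exp(-logit))


# Three toy 2x2 frames standing in for a short video clip.
video = [
    [[0.0, 0.2], [0.1, 0.3]],
    [[0.5, 0.7], [0.6, 0.8]],
    [[0.9, 1.0], [1.0, 0.9]],
]
score = video_score(video)
```

In a real model the row-mean encoder would be replaced by a pretrained image backbone and the attention step by a learned temporal transformer; the sketch only shows how per-frame features flow into a sequence model and then into a single per-video prediction.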