Abstract: This paper introduces a neuromorphic dataset and methodology for eye tracking, harnessing event data streamed continuously by a Dynamic Vision Sensor (DVS). The framework integrates a directly trained Spiking Neural Network (SNN) regression model and leverages a state-of-the-art low-power edge neuromorphic processor, Speck. First, it introduces a representative event-based eye-tracking dataset, "Ini-30", which was collected with two glass-mounted DVS cameras from thirty volunteers. Then, an SNN model based on Integrate-and-Fire (IAF) neurons, named "Retina", is described, featuring only 64k parameters (6.63x fewer than 3ET) and achieving a pupil tracking error of only 3.24 pixels on a 64x64 DVS input. The continuous regression output is obtained by temporal convolution, using a non-spiking 1D filter slid across the output spiking layer over time. Retina is evaluated on the neuromorphic processor, showing an end-to-end power of 2.89-4.8 mW and a latency of 5.57-8.01 ms, depending on the time window used to slice the event-based video recording. The model is more precise than the latest event-based eye-tracking method, "3ET", on Ini-30, and shows comparable performance with significant model compression (35 times fewer MAC operations) on the synthetic dataset used in "3ET". We hope this work will open avenues for further investigation of closed-loop neuromorphic solutions and true event-based training pursuing edge performance.
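The temporal-convolution readout mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `temporal_readout`, the `(T, C)` spike layout, and the fixed kernel weights are all assumptions made for illustration, and the abstract does not say how the filter weights are obtained.

```python
import numpy as np


def temporal_readout(spikes: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Slide a non-spiking 1D filter over time across binary spike trains.

    spikes: array of shape (T, C), T time steps and C output neurons (assumed layout).
    kernel: array of shape (K,), where kernel[-1] weights the most recent step.
    Returns a continuous-valued array of shape (T, C).
    """
    T, C = spikes.shape
    K = kernel.shape[0]
    # Left-pad with zeros so the output at step t only sees steps t-K+1 .. t (causal).
    padded = np.vstack([np.zeros((K - 1, C)), spikes])
    out = np.empty((T, C))
    for t in range(T):
        window = padded[t:t + K]   # shape (K, C), oldest step first
        out[t] = kernel @ window   # weighted sum over the time window
    return out


# Hypothetical example: 4 time steps, 2 output neurons (e.g. x and y coordinates).
spikes = np.array([[1, 0], [0, 1], [1, 1], [0, 0]], dtype=float)
kernel = np.array([0.25, 0.5, 1.0])  # illustrative weights, not taken from the paper
print(temporal_readout(spikes, kernel))
```

Each output row is a weighted sum of the last `K` spike vectors, which is how a sparse binary spike train can yield a smooth, continuous coordinate estimate at every time step.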