Abstract: This paper presents a novel approach for joint
point-feature detection and tracking, designed specifically
for Pixel Processor Array (PPA) vision sensors. Instead of
standard pixels, PPA sensors consist of thousands of
“pixel-processors”, enabling massively parallel computation
of visual data at the point of light capture. Our approach
performs all computation in-pixel, meaning no raw image
data need ever leave the sensor for external processing.
We introduce a Descriptor-In-Pixel paradigm, in which a
feature descriptor is held within the memory of each
pixel-processor. The PPA’s architecture enables the response
of every processor’s descriptor to the current image to be
computed in parallel. This produces a “descriptor response
map” which, when a suitable layout of descriptors is
generated across the pixel-processors, can be used for both
point-feature detection and tracking. This reduces sensor
output to just sparse feature locations and descriptors,
read out via an address-event interface, giving a greater
than 1000× reduction in data transfer compared to raw image
output.
The sparse readout and complete utilization of all
pixel-processors make our approach very efficient. Our
implementation on the SCAMP-7 PPA prototype runs at over
3000 FPS (Frames Per Second), tracking point-features
reliably under violent motion. This is the first work
performing point-feature detection and tracking entirely
in-pixel.
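
The idea of a descriptor response map can be illustrated with a small host-side sketch. This is not the SCAMP-7 implementation: the 4-bit neighbour-comparison descriptor, the bit-match response measure, the threshold, and all function names below are assumptions chosen only to make the concept concrete. Each "pixel" stores a descriptor, every pixel compares it against the descriptor seen in the current image, and only the coordinates of strong responses are read out.

```python
# Illustrative sketch only: the descriptor, response measure, and threshold
# are assumptions for this example, not the method used on the PPA hardware.

OFFSETS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # hypothetical 4-neighbour pattern


def local_descriptor(image, r, c):
    """4-bit binary descriptor: is each neighbour brighter than the centre?"""
    h, w = len(image), len(image[0])
    bits = []
    for dr, dc in OFFSETS:
        rr = min(max(r + dr, 0), h - 1)  # clamp at the image border
        cc = min(max(c + dc, 0), w - 1)
        bits.append(1 if image[rr][cc] > image[r][c] else 0)
    return tuple(bits)


def response_map(image, stored):
    """Per-pixel count of bits where the stored descriptor matches the image."""
    h, w = len(image), len(image[0])
    return [
        [sum(a == b for a, b in zip(stored[r][c], local_descriptor(image, r, c)))
         for c in range(w)]
        for r in range(h)
    ]


def sparse_readout(responses, threshold):
    """Return only the (row, col) locations whose response meets the threshold."""
    return [(r, c)
            for r, row in enumerate(responses)
            for c, value in enumerate(row)
            if value >= threshold]


# A dark pixel surrounded by brighter ones.
image = [
    [5, 5, 5],
    [5, 1, 5],
    [5, 5, 5],
]
# Here every pixel stores the same descriptor ("all neighbours are brighter");
# in general each pixel could hold a different one.
stored = [[(1, 1, 1, 1)] * 3 for _ in range(3)]

responses = response_map(image, stored)
locations = sparse_readout(responses, threshold=4)
print(responses)
print(locations)
```

On a PPA the per-pixel loop would run simultaneously on every pixel-processor; the nested loops here only emulate that parallel step sequentially. The final list of coordinates stands in for the sparse output that replaces a full image readout.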