Abstract: Highlights•The first tracker leverage multi-level visual attention.•Utilize LSTM to encode the video temporal information.•Make DNN-based tracking-by-detection more computationally feasible.•A promoting strategy for trackers with detection results.•Yield superior performance in two benchmark.
Loading