Abstract: Highlights
• An STMENet scheme is proposed to address infrared dim-small target detection.
• A 2D–3D dual-branch extractor is designed to extract infrared target features.
• A Spatio-temporal Mix Encoder is designed to further enhance feature representations.