Abstract: Highlights
• Propose a Transformer-based multi-task model for dense prediction.
• Propose an asymmetric attention-based task interaction method with task guidance.
• Design a high-quality, low-cost upsampling method to avoid loss of image detail.
• Incorporate a CNN into the Transformer to model local objects and global spatial relationships simultaneously.
• Achieve optimal multi-task performance on the public datasets NYUD-v2 and PASCAL Context.