Abstract: Highlights
• The Mix-Tower model combines the strengths of both single-tower and dual-tower models.
• The same model architecture is used to process different types of data.
• A lightweight FFN module has been incorporated into the Transformer.
• Compared to other baseline models, our model achieves superior performance.