Abstract: Highlights
• Grassmann subspaces extend spatial/channel attention for multi-modal fusion.
• Low-rank mappings disentangle image semantics for multi-scale cross-modal learning.
• A cross-modal strategy enhances foreground-background information separation.
• Our manifold-based fusion approach shows superior benchmark performance.
External IDs: dblp:journals/inffus/KangLWXWCK26