Abstract: Highlights
• Show that incomplete interactions limit the rationality of token distribution.
• Design CLVIN, a quadratic E-D mode model, to realize a reasonable token distribution.
• Propose CLVIN-c to further improve model size and performance.
• Achieve significant gains over, or performance comparable to, some existing SOTA methods.