Keywords: Large Language Model, Mixture-of-Experts, Routing Mechanism
Abstract: Mixture-of-Experts (MoE) has become a prevalent method for scaling up large language models at a reduced computational cost. Despite its effectiveness, the routing mechanism of MoEs still lacks a clear understanding from the perspective of cross-layer mechanistic interpretability. We propose a lightweight methodology with which we can break down the routing decisions of MoEs into contributions of model components, in a recursive fashion. We use our methodology to dissect the routing mechanism by decomposing the inputs of routers into model components, and we study how different components contribute to routing across widely used open models. Our findings on four production models reveal common patterns, such as: a) MoE layer outputs contribute more than attention layer outputs to the routing decisions of later layers; b) \emph{MoE entanglement}, in which the activation of experts in one layer consistently correlates with the activation of experts in later layers; and c) some components persistently influence routing across many subsequent layers. Our study also includes findings on how models differ in the long-range and short-range inhibiting/promoting effects that components exert on MoEs in later layers. Our results indicate the importance of quantifying the cross-layer impact of components on MoEs, and they highlight opportunities for using cross-layer contributions in effective model design and model serving.
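To make the decomposition concrete, below is a minimal sketch of the core idea of attributing router decisions to upstream components. It assumes a standard pre-norm MoE transformer in which the router is a linear map over the residual stream, so that router logits decompose additively over the component outputs that sum into that stream; the names (`decompose_router_logits`, `W_router`, the component labels) and the omission of the LayerNorm nonlinearity and the recursive step are simplifying assumptions for illustration, not the paper's exact method.

```python
import numpy as np

def decompose_router_logits(component_outputs, router_weight):
    """Attribute a router's logits to upstream model components.

    In a pre-norm MoE transformer, the residual stream entering the
    router at layer L is the sum of the token embedding and all earlier
    attention/MoE layer outputs. Because the router is linear, its
    logits split additively over those components.

    component_outputs: dict of component name -> (d_model,) vector.
    router_weight: (n_experts, d_model) router projection matrix.
    Returns: dict of component name -> (n_experts,) logit contribution.
    """
    return {name: router_weight @ out
            for name, out in component_outputs.items()}

# Toy example: embedding plus two layer outputs feed a 4-expert router.
rng = np.random.default_rng(0)
d_model, n_experts = 8, 4
components = {name: rng.normal(size=d_model)
              for name in ("embed", "attn_0", "moe_0")}
W_router = rng.normal(size=(n_experts, d_model))

contribs = decompose_router_logits(components, W_router)
total_logits = W_router @ sum(components.values())
# Per-component contributions sum back to the full router logits.
assert np.allclose(sum(contribs.values()), total_logits)
for name, c in contribs.items():
    print(name, np.round(c, 2))
```

In this additive view, the sign and magnitude of each component's contribution to an expert's logit indicate whether that component promotes or inhibits routing to that expert, which is the quantity the cross-layer analysis aggregates.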
Primary Area: interpretability and explainable AI
Submission Number: 16026