The Taylor expansion for dropout is divergent (Extended abstract)
Feb 12, 2018 (modified: Feb 12, 2018) · ICLR 2018 Workshop Submission
Abstract: This extended abstract examines the assumptions behind the derived equivalence between dropout noise injection and $L_2$ regularisation for logistic regression with negative log loss (Wager et al., 2013). We show that the approximation method is based on a divergent Taylor expansion. Hence, subsequent work that uses this approximation to compare dropout-trained logistic regression with standard regularisers remains, unfortunately, ill-founded to date.
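For context, the approximation at issue can be sketched as follows (a paraphrase of the standard second-order argument in the style of Wager et al., 2013; the exact notation here is an assumption, not quoted from the paper). For a generalised linear model with log-partition function $A$, the negative log loss is $\ell(\theta; x, y) = A(\theta^\top x) - y\,\theta^\top x$. With dropout noise $\tilde{x}$ chosen so that $\mathbb{E}[\tilde{x}] = x$, a second-order Taylor expansion of the expected loss gives

$$
\mathbb{E}\!\left[\ell(\theta; \tilde{x}, y)\right]
\;\approx\;
\ell(\theta; x, y)
\;+\;
\tfrac{1}{2}\, A''(\theta^\top x)\, \mathrm{Var}\!\left[\theta^\top \tilde{x}\right],
$$

where for logistic regression $A(z) = \log(1 + e^{z})$ and hence $A''(z) = \sigma(z)\bigl(1 - \sigma(z)\bigr)$. The second term is the quadratic penalty interpreted as an adaptive $L_2$ regulariser. The claim of this abstract is that the Taylor series being truncated here diverges, so dropping the higher-order terms is not a controlled approximation.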
TL;DR: We show that a well-known approximation of dropout relies on a divergent Taylor expansion and is therefore ill-founded.