The Taylor expansion for dropout is divergent (Extended abstract)

Natalie Schluter

Feb 12, 2018 · ICLR 2018 Workshop Submission
  • Abstract: This extended abstract examines the assumptions behind the derived equivalence between dropout noise injection and $L_2$ regularisation for logistic regression with negative log loss (Wager et al., 2013). We show that the approximation method rests on a divergent Taylor expansion. Hence, subsequent work that uses this approximation to compare the dropout-trained logistic regression model with standard regularisers unfortunately remains ill-founded to date.
  • TL;DR: We show that a well-known approximation of dropout relies on a divergent Taylor expansion and is therefore ill-founded.
  • Keywords: dropout, regularisation, logistic regression
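For context, the approximation under examination can be sketched as follows (notation loosely following Wager et al., 2013; the exact symbols here are illustrative, not taken from the abstract). For a generalised linear model with log-partition function $A$, the noised feature vector $\tilde{x}$ enters the negative log-likelihood only through $A(\tilde{x} \cdot \beta)$, and the dropout objective is approximated by a second-order Taylor expansion around the mean:

```latex
% Second-order Taylor approximation of the expected noised loss
% (the step whose higher-order remainder this abstract argues
% does not vanish, since the full series diverges).
\mathbb{E}\!\left[ A(\tilde{x} \cdot \beta) \right]
  \approx A(x \cdot \beta)
  + \tfrac{1}{2}\, A''(x \cdot \beta)\, \operatorname{Var}\!\left[ \tilde{x} \cdot \beta \right],
% where, for logistic regression with sigmoid \sigma,
\quad A''(x \cdot \beta) = \sigma(x \cdot \beta)\,\bigl(1 - \sigma(x \cdot \beta)\bigr).
```

The quadratic correction term is what Wager et al. interpret as an adaptive $L_2$-style regulariser; the abstract's claim is that the Taylor series justifying the truncation is divergent, so the truncation error cannot be bounded in the usual way.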