The paper titled 'Topically Driven Neural Language Model' adopts a max-over-time pooling technique in its equations. This method is inspired by, and was proposed in, another paper that you have read. What is the full title of that paper?