MIPT Deep Learning Club #13
Taras Khakhulin on "Breaking the Softmax Bottleneck: A High-Rank RNN Language Model"
The problem of building a language model was framed as matrix factorization. The authors showed that the softmax output layer imposes a rank bottleneck that limits the expressive power of the model, and they proposed to address this limitation with a Mixture of Softmaxes (MoS).
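To make the idea concrete, here is a minimal NumPy sketch of a Mixture of Softmaxes output layer: K softmax components are combined with context-dependent mixture weights, so the resulting log-probability matrix is no longer restricted to the rank of the hidden dimension. All parameter names (`W_prior`, `W_proj`, `W_vocab`) and shapes are illustrative assumptions, not the paper's exact parameterization.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def mixture_of_softmaxes(h, W_prior, W_proj, W_vocab):
    """Hypothetical MoS sketch.
    h:       (d,)     context vector from the RNN
    W_prior: (K, d)   parameters for the mixture weights
    W_proj:  (K, d, d) per-component context projections
    W_vocab: (V, d)   shared output word embedding
    """
    K = W_prior.shape[0]
    pi = softmax(W_prior @ h)                  # (K,) mixture weights pi_k
    probs = np.zeros(W_vocab.shape[0])
    for k in range(K):
        hk = np.tanh(W_proj[k] @ h)            # component-specific context
        probs += pi[k] * softmax(W_vocab @ hk) # weighted sum of K softmaxes
    return probs                               # valid distribution over V words

d, V, K = 8, 20, 3
h = rng.standard_normal(d)
p = mixture_of_softmaxes(
    h,
    rng.standard_normal((K, d)),
    rng.standard_normal((K, d, d)),
    rng.standard_normal((V, d)),
)
print(p.shape, round(p.sum(), 6))
```

The key point is that the mixture is taken in probability space, not logit space: a weighted sum of softmaxes is generally not itself a softmax of any single low-rank logit, which is what lifts the rank restriction.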
The results reported in the paper are impressive: extensive experiments were conducted, and state-of-the-art results were achieved on a number of datasets.
To conclude, the expressive power of even a very strong RNN language model is restricted by its softmax layer: the log-probability matrix of natural language is high-rank, while a single softmax over a low-dimensional hidden state can only produce a low-rank approximation of it.