chapter 8 - Neural Networks and Neural Language Models
- Embeddings
Just a quick skim of this part, since it's already quite familiar.
chapter 8 - neural networks
chapter 15 - semantic representations for words called embeddings
chapter 25 - sequence-to-sequence (seq2seq) models, applied to language generation: machine translation, conversational agents, and summarization.
Neural Language Models
Compared with the paradigm introduced in chapter 4 (the smoothed N-grams), neural-network-based language models have several advantages:
- No smoothing is needed
- They can handle much longer histories
- They generalize better over contexts of similar words
Beyond that, they also serve as the foundation for generation models.
Embeddings
Word vectors: why represent words as vectors?
Vectors turn out to be a really powerful representation for words, because a distributed representation allows words that have similar meanings, or similar grammatical properties, to have similar vectors.
In other words, words with similar meanings or similar grammatical properties end up with similar vectors.
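To make this concrete, here is a tiny sketch comparing word vectors with cosine similarity. The `cosine` helper and the toy embedding values are illustrative assumptions, not trained vectors:

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity: close to 1.0 for similar directions, near 0.0 for unrelated ones."""
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

# Toy 4-dimensional embeddings (made-up values purely for illustration).
emb = {
    "cat":   np.array([0.8, 0.1, 0.0, 0.3]),
    "dog":   np.array([0.7, 0.2, 0.1, 0.3]),
    "table": np.array([0.0, 0.9, 0.8, 0.1]),
}

print(cosine(emb["cat"], emb["dog"]))    # high: similar meaning -> similar vectors
print(cosine(emb["cat"], emb["table"]))  # lower: unrelated words
```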
Word embeddings can be trained with a neural network language model. Here a 4-gram model is used, so the next word is predicted from the previous three words (a forward-pass sketch follows the layer list below):
- input layer: three one-hot word vectors, each $1 \times |V|$
- embedding matrix E: $d \times |V|$ (one embedding looked up per context word)
- projection layer: the three embeddings concatenated, $1 \times 3d$
- hidden layer: W of shape $d_h \times 3d$, producing $1 \times d_h$
- output layer: U of shape $|V| \times d_h$, producing $1 \times |V|$
- softmax: $P(w_t=i \mid w_{t-1},w_{t-2},w_{t-3})$
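A minimal NumPy sketch of this forward pass, assuming the shapes listed above; the random untrained parameters, tanh nonlinearity, and all names here are illustrative assumptions, not the book's code:

```python
import numpy as np

np.random.seed(0)
V, d, d_h = 10, 5, 8              # vocabulary size |V|, embedding dim d, hidden dim d_h

# Parameters (random here; in practice learned jointly by backpropagation)
E = np.random.randn(d, V)         # embedding matrix, d x |V|
W = np.random.randn(d_h, 3 * d)   # hidden-layer weights, d_h x 3d
b = np.random.randn(d_h)          # hidden-layer bias
U = np.random.randn(V, d_h)       # output-layer weights, |V| x d_h

def softmax(z):
    z = z - z.max()               # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def forward(context_ids):
    """context_ids = vocabulary indices of w_{t-3}, w_{t-2}, w_{t-1}."""
    # 1. embedding lookup: selecting a column of E == multiplying E by a one-hot vector
    embeds = [E[:, i] for i in context_ids]   # three vectors of length d
    # 2. projection layer: concatenate into a single 1 x 3d vector
    e = np.concatenate(embeds)
    # 3. hidden layer: tanh(W e + b), giving 1 x d_h
    h = np.tanh(W @ e + b)
    # 4. output layer + softmax: distribution over the vocabulary, 1 x |V|
    return softmax(U @ h)

p = forward([2, 7, 4])            # P(w_t = i | w_{t-1}, w_{t-2}, w_{t-3}) for every i
print(p.shape, round(p.sum(), 6)) # (10,) 1.0
```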