chapter 8 - Neural Networks and Neural Language Models

  • Embeddings

Just a quick pass over this part, since it's already quite familiar...

  • chapter 8 - neural networks
  • chapter 15 - semantic representations for words called embeddings
  • chapter 25 - the sequence-to-sequence (seq2seq) model, applied to language generation tasks: machine translation, conversational agents, and summarization.

Neural Language Models

Compared with the paradigm introduced in chapter 4 (the smoothed N-grams), neural-network-based language models have several advantages:

  • No smoothing is needed
  • They can handle much longer histories
  • They generalize better over contexts of similar words
  • Beyond that, they serve as the basis for generation models

Embeddings

Word vectors: why represent words as vectors?

Vectors turn out to be a really powerful representation for words, because a distributed representation allows words that have similar meanings, or similar grammatical properties, to have similar vectors.

Words with similar meanings or similar grammatical properties end up with similar vectors.
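The similarity between two such vectors is usually measured with cosine similarity. A minimal sketch, assuming a few hypothetical pre-trained vectors (the numbers below are purely illustrative):

```python
import numpy as np

# Hypothetical embeddings for illustration only (real ones come from training).
embeddings = {
    "good":  np.array([0.80, 0.10, 0.30]),
    "great": np.array([0.75, 0.15, 0.35]),
    "car":   np.array([-0.20, 0.90, 0.10]),
}

def cosine(u, v):
    # Cosine similarity: close to 1.0 for vectors pointing the same way.
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine(embeddings["good"], embeddings["great"]))  # high: similar meaning
print(cosine(embeddings["good"], embeddings["car"]))    # lower: unrelated words
```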

Word embeddings can be trained with a neural language model. Here a 4-gram model is used, so the next word is generated from the previous three words (see the sketch after the list below):

  • input layer: three one-hot vectors, one per context word, each 1x|V|
  • embedding matrix E: dx|V|
  • projection layer: the three embeddings concatenated, 1x3d
  • hidden layer: W with shape (d_h x 3d) -> 1xd_h
  • output layer: U with shape (|V| x d_h) -> 1x|V|
  • softmax: \(P(w_t=i \mid w_{t-1},w_{t-2},w_{t-3})\)
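Putting the shapes above together, here is a minimal NumPy sketch of the forward pass (the vocabulary size, dimensions, random initialization, and the tanh nonlinearity are illustrative assumptions, not the book's exact setup):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: vocabulary |V|, embedding dim d, hidden dim d_h.
V, d, d_h = 10000, 50, 100

E = rng.normal(size=(d, V))        # embedding matrix, d x |V|
W = rng.normal(size=(d_h, 3 * d))  # hidden-layer weights, d_h x 3d
b = np.zeros(d_h)                  # hidden-layer bias
U = rng.normal(size=(V, d_h))      # output-layer weights, |V| x d_h

def forward(w_prev3, w_prev2, w_prev1):
    """Return P(w_t = i | w_{t-1}, w_{t-2}, w_{t-3}) for every word i.

    The arguments are the indices of the three previous words; selecting a
    column of E is equivalent to multiplying E by the one-hot input vector.
    """
    e = np.concatenate([E[:, w_prev3], E[:, w_prev2], E[:, w_prev1]])  # projection layer, 3d
    h = np.tanh(W @ e + b)                                             # hidden layer, d_h
    z = U @ h                                                          # output scores, |V|
    z -= z.max()                                                       # numerical stability
    return np.exp(z) / np.exp(z).sum()                                 # softmax over the vocabulary

p = forward(17, 42, 99)   # hypothetical word indices
print(p.shape, p.sum())   # (10000,) 1.0
```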