LSTM
Learning Target
Cross entropy between the output and the reference label at every time step.
Learning
Backpropagation through time (BPTT)
- RNN-based networks are not always easy to learn.
The error surface is rough: it is either very flat or very steep.
Clipping: when the gradient > 15, set the gradient to 15 (clamp it at the threshold).
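A minimal sketch of this rule in PyTorch; the toy model, data, and loss are placeholders, not code from the notes:

```python
import torch
import torch.nn as nn

model = nn.RNN(input_size=8, hidden_size=16, batch_first=True)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

x = torch.randn(4, 10, 8)            # (batch, time, features), dummy data
out, _ = model(x)
loss = out.pow(2).mean()             # dummy loss just to produce gradients

optimizer.zero_grad()
loss.backward()
# Clamp every gradient element into [-15, 15], matching the rule above;
# clip_grad_norm_ would instead rescale by the global gradient norm.
torch.nn.utils.clip_grad_value_(model.parameters(), clip_value=15.0)
optimizer.step()
```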
Why the problem arises: the same weights are reused at every time step, so once a weight w has any effect, that effect is multiplied over and over and can become very large (or die out).
LSTM can deal with gradient vanishing (but not gradient explosion), so the learning rate can usually be set a bit smaller.
LSTM handles memory differently from a vanilla RNN:
- An RNN washes out its memory at every time step.
- In an LSTM, memory and input are added; the influence never disappears unless the forget gate is closed.
GRU: the old must go before the new can come in; a single update gate ties clearing the old memory to admitting the new input.
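A minimal sketch contrasting the three update rules; the `gates` callables are hypothetical stand-ins for the learned gate computations:

```python
import torch

def rnn_step(x, h, W, U, b):
    # Vanilla RNN: the old memory h is pushed through the weights and
    # squashed at every step, so it can be completely washed out.
    return torch.tanh(W @ x + U @ h + b)

def lstm_step(x, h, c, gates):
    # LSTM: the cell c is updated *additively*; while the forget gate f
    # stays open (near 1), old information keeps flowing forward.
    i, f, o, g = gates(x, h)      # input/forget/output gates + candidate
    c = f * c + i * g             # additive memory update
    return o * torch.tanh(c), c

def gru_step(x, h, gates):
    # GRU: one update gate z couples forgetting and writing, so old
    # content leaves exactly when new content comes in.
    # (The reset gate r acts inside the candidate computation.)
    z, r, h_tilde = gates(x, h)
    return (1 - z) * h + z * h_tilde
```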
Other networks that try to solve gradient vanishing:
- Clockwork RNN
- Structurally Constrained Recurrent Network (SCRN)
Applications of LSTM
Many to one: the input is a vector sequence, the output is a single vector (a minimal sketch follows this list).
- sentiment analysis
- key term extraction
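A minimal many-to-one sketch in PyTorch, assuming a toy vocabulary and binary sentiment labels (all names and sizes are illustrative):

```python
import torch
import torch.nn as nn

class SentimentLSTM(nn.Module):
    """Many-to-one: read the whole sequence, emit one label."""
    def __init__(self, vocab_size=10000, embed_dim=64, hidden=128, classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, classes)

    def forward(self, tokens):                 # tokens: (batch, time)
        h, _ = self.lstm(self.embed(tokens))   # h: (batch, time, hidden)
        return self.head(h[:, -1])             # classify from the last step only

model = SentimentLSTM()
logits = model(torch.randint(0, 10000, (4, 20)))   # 4 sentences, 20 tokens each
loss = nn.CrossEntropyLoss()(logits, torch.tensor([0, 1, 1, 0]))
```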
Many to many (the output is shorter than the input)
- speech recognition: the input is a long acoustic feature sequence, but the output character sequence is shorter
- Trimming (collapsing repeated outputs): cannot distinguish 好棒 from 好棒棒.
- CTC: Connectionist Temporal Classification. A blank symbol is added to the output alphabet so that genuinely repeated characters survive trimming.
- Training: enumerate all possible alignments of the target with blanks and treat them all as correct (see the CTC sketch below).
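A minimal sketch using PyTorch's `nn.CTCLoss`, whose internal dynamic programming sums over all valid alignments, so the enumeration never has to be done explicitly; the shapes below are made up:

```python
import torch
import torch.nn as nn

T, N, C = 50, 2, 30          # input steps, batch, classes (including blank)
log_probs = torch.randn(T, N, C).log_softmax(2)   # e.g. RNN acoustic output
targets = torch.randint(1, C, (N, 10))            # label sequences, no blanks
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.full((N,), 10, dtype=torch.long)

ctc = nn.CTCLoss(blank=0)    # class 0 is the inserted blank symbol
loss = ctc(log_probs, targets, input_lengths, target_lengths)
```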
- sequence-to-sequence learning: both the input and the output are sequences, with no constraint on their lengths
  - add a stop symbol '===' so the model knows where to end the output (a decoding sketch follows)
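A minimal greedy-decoding sketch showing the role of the stop symbol; the GRU encoder/decoder and the token ids (`bos`, `EOS` standing for '===') are assumptions, not the model from the notes:

```python
import torch
import torch.nn as nn

EOS = 2   # token id for the stop symbol '===' (assumed vocabulary layout)

class Seq2Seq(nn.Module):
    def __init__(self, vocab=1000, dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.encoder = nn.GRU(dim, dim, batch_first=True)
        self.decoder = nn.GRUCell(dim, dim)
        self.out = nn.Linear(dim, vocab)

    @torch.no_grad()
    def generate(self, src, max_len=30, bos=1):
        _, h = self.encoder(self.embed(src))       # compress the input sequence
        h, tok, result = h[0], torch.tensor([bos]), []
        for _ in range(max_len):
            h = self.decoder(self.embed(tok), h)   # feed back the last output
            tok = self.out(h).argmax(-1)
            if tok.item() == EOS:                  # the model emitted '===': stop
                break
            result.append(tok.item())
        return result

model = Seq2Seq()
model.generate(torch.randint(3, 1000, (1, 12)))    # decode until '===' or max_len
```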
- syntactic parsing: the parse tree is written out as a bracketed output sequence
- sequence-to-sequence auto-encoder
  - text
  - speech: encode an audio segment into a fixed-length vector, used for voice search (a sketch follows this list)
- chat-bot
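A minimal sequence-to-sequence auto-encoder sketch; the 39-dimensional input (a typical MFCC size) and the layer widths are assumptions. The bottleneck vector is what a voice-search system would compare between a spoken query and indexed audio segments:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SeqAutoencoder(nn.Module):
    """Encode a sequence into one vector, then try to reconstruct it."""
    def __init__(self, feat=39, dim=128):
        super().__init__()
        self.encoder = nn.LSTM(feat, dim, batch_first=True)
        self.decoder = nn.LSTM(dim, feat, batch_first=True)

    def forward(self, x):                        # x: (batch, time, feat)
        _, (h, _) = self.encoder(x)              # h: (1, batch, dim) bottleneck
        z = h[0]                                 # the sequence embedding
        # Feed the embedding at every step and reconstruct the input.
        rep = z.unsqueeze(1).expand(-1, x.size(1), -1)
        recon, _ = self.decoder(rep)
        return z, recon

model = SeqAutoencoder()
x = torch.randn(2, 100, 39)                      # two audio segments
z, recon = model(x)
loss = F.mse_loss(recon, x)                      # reconstruction objective
```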
Attention-based model
- Reading comprehension
- Visual Question Answering
- Speech Question Answering
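All three share the same mechanism: match a query against a stored document/image/audio representation and read out the relevant part. A minimal dot-product attention sketch (a generic formulation, not any specific model from the notes):

```python
import torch

def attention(query, keys, values):
    # query: (batch, dim); keys/values: (batch, time, dim).
    # Match the query against every position, turn the scores into
    # softmax weights, and read out a weighted sum of the values.
    scores = torch.einsum('bd,btd->bt', query, keys)   # dot-product match
    weights = scores.softmax(dim=-1)
    return torch.einsum('bt,btd->bd', weights, values)

q = torch.randn(1, 64)          # e.g. the encoded question
kv = torch.randn(1, 50, 64)     # e.g. 50 encoded sentences of a document
context = attention(q, kv, kv)  # the part of the document attended to
```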
The relationship between deep learning and structured learning
Integrate the two and learn them together (e.g., an LSTM front end with an HMM or CRF on top).
Is structured learning practical?