AMMAI: Deep neural networks for acoustic modeling in speech recognition

Introduction:
In speech recognition, the GMM-HMM approach was the dominant one for a long time. But GMMs have a serious weakness: they are inefficient at modeling data that lie on or near a nonlinear manifold. For example, if most points lie close to the surface of a sphere, a suitable model needs only a few parameters, but a GMM needs a very large number of them. Researchers have now replaced the GMM with a DNN and shown that the DNN outperforms the GMM.
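To make the parameter argument concrete, here is a rough count, given as a sketch rather than anything taken from the paper. The function names, the feature dimension of 39, and the component counts are all hypothetical choices for illustration, and the "sphere model" is assumed to be just a center plus a radius.

```python
def diagonal_gmm_param_count(n_components, dim):
    """Free parameters of a GMM with diagonal covariances."""
    # Each component has `dim` means and `dim` variances;
    # the mixing weights sum to one, so only n_components - 1 are free.
    return n_components * 2 * dim + (n_components - 1)


def sphere_param_count(dim):
    """Parameters of an assumed sphere model: a center and a radius."""
    return dim + 1


dim = 39  # hypothetical feature dimension, chosen only for illustration
for n_components in (8, 64, 512):
    print(n_components, "Gaussians:", diagonal_gmm_param_count(n_components, dim))
print("sphere:", sphere_param_count(dim))
```

The GMM count grows linearly with the number of components, while the sphere count stays fixed. The sketch does not say how many components a real sphere-like data set would need.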

Method:
Restricted Boltzmann machine (RBM)

1. First, a Gaussian-Bernoulli RBM (GRBM) is trained to model a window of frames of real-valued acoustic coefficients.

2. Then the states of the binary hidden units of the GRBM are used as data for training an RBM. This is repeated to create as many hidden layers as desired.

3. Then the stack of RBMs is converted to a single generative model, a deep belief network (DBN), by replacing the undirected connections of the lower-level RBMs with top-down, directed connections.

4. Finally, a pre-trained DBN-DNN is created by adding a “softmax” output layer that contains one unit for each possible state of each HMM.
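The four steps above can be sketched in NumPy as follows. This is a minimal illustration under several assumptions, not the authors' implementation: each RBM is trained with one-step contrastive divergence (CD-1) on the full batch, the GRBM assumes inputs normalized to zero mean and unit variance, and the names `train_rbm`, `pretrain_dbn_dnn` and `forward`, along with all hyperparameters, are hypothetical. Step 3 has no separate code here, because the feed-forward DNN only reuses the bottom-up weights of the stack.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def train_rbm(data, n_hidden, gaussian_visible, rng, epochs=5, lr=0.01):
    """Train one RBM with CD-1 (an assumed, simplified training rule)."""
    n_samples, n_visible = data.shape
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    b_vis = np.zeros(n_visible)
    b_hid = np.zeros(n_hidden)
    for _ in range(epochs):
        # Positive phase: hidden probabilities and sampled binary states.
        h_prob = sigmoid(data @ W + b_hid)
        h_state = (rng.random(h_prob.shape) < h_prob).astype(float)
        # Negative phase: reconstruct the visible units from the hidden states.
        v_recon = h_state @ W.T + b_vis   # Gaussian units: use the mean
        if not gaussian_visible:
            v_recon = sigmoid(v_recon)    # binary units: use probabilities
        h_recon = sigmoid(v_recon @ W + b_hid)
        # CD-1 parameter updates, averaged over the batch.
        W += lr * (data.T @ h_prob - v_recon.T @ h_recon) / n_samples
        b_vis += lr * (data - v_recon).mean(axis=0)
        b_hid += lr * (h_prob - h_recon).mean(axis=0)
    return W, b_hid


def pretrain_dbn_dnn(frames, hidden_sizes, n_hmm_states, seed=0):
    """Greedy layer-wise pre-training followed by a softmax output layer."""
    rng = np.random.default_rng(seed)
    layers = []
    layer_input = frames
    for i, n_hidden in enumerate(hidden_sizes):
        # Step 1: the first layer is a GRBM on real-valued frames.
        # Step 2: later layers are binary RBMs on hidden states from below.
        W, b = train_rbm(layer_input, n_hidden, gaussian_visible=(i == 0), rng=rng)
        layers.append((W, b))
        h_prob = sigmoid(layer_input @ W + b)
        layer_input = (rng.random(h_prob.shape) < h_prob).astype(float)
    # Step 4: add a randomly initialized softmax layer, one unit per HMM state.
    W_out = 0.01 * rng.standard_normal((hidden_sizes[-1], n_hmm_states))
    layers.append((W_out, np.zeros(n_hmm_states)))
    return layers


def forward(layers, frames):
    """Deterministic forward pass returning HMM-state posteriors."""
    x = frames
    for W, b in layers[:-1]:
        x = sigmoid(x @ W + b)
    W_out, b_out = layers[-1]
    logits = x @ W_out + b_out
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)
```

A call such as `pretrain_dbn_dnn(frames, [8, 6], n_hmm_states=5)` returns the initial weights only. The whole network would still need discriminative fine-tuning with backpropagation, which this sketch leaves out.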

Experiment:

Tuesday, June 21, 2016
