Informacionnye Tehnologii, 2019, vol. 25, no. 5, pp. 313-318

ABSTRACTS OF ARTICLES OF THE JOURNAL "INFORMATION TECHNOLOGIES".
No. 5. Vol. 25. 2019

DOI: 10.17587/it.25.313-318

L. V. Savchenko, Ph. D., e-mail: lsavchenko@hse.ru, National Research University Higher School of Economics — N. Novgorod

Computer-Assisted Language Learning Based on Convolutional Neural Networks and Information Theory of Speech Perception

This paper considers the problem of computer-assisted language and pronunciation learning based on deep neural networks and the information theory of speech perception. First, the user learns to pronounce words stably. The user's best utterances, those assigned a high posterior probability by a pre-trained convolutional neural network, are added to the training set, which is then used to fine-tune the same network. If the resulting network successfully recognizes new utterances, the pronunciation of all words is considered distinguishable. In that case, the stability of pronunciation of each class (word) is additionally verified: the closeness of the user's pronunciations is estimated by computing the average Kullback-Leibler information discrimination between each signal and the centroid reference of the class. If this mean discrimination for a particular word exceeds a certain threshold, the training for that word should be repeated. Experimental results for learning English words show that, for existing acoustic models, the proposed approach achieves higher accuracy and speed than conventional techniques.
Keywords: computer-assisted learning system, speech recognition, convolutional neural network, deep learning, Kullback-Leibler information discrimination
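
The stability check described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation. It assumes that each utterance is already represented as a non-negative feature vector (for example, a power spectrum) that can be normalized into a probability distribution, and that the centroid reference of a class is the arithmetic mean of these distributions. The function names and the threshold value are hypothetical.

```python
import numpy as np


def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence D(p || q) between two discrete distributions."""
    # eps avoids log(0) and division by zero (an assumption of this sketch).
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p /= p.sum()
    q /= q.sum()
    return float(np.sum(p * np.log(p / q)))


def mean_discrimination(utterances):
    """Average KL divergence between each utterance and the class centroid."""
    feats = np.asarray(utterances, dtype=float)
    # Normalize every utterance so that it sums to one.
    feats = feats / feats.sum(axis=1, keepdims=True)
    # Assumption: the centroid reference is the mean of the normalized utterances.
    centroid = feats.mean(axis=0)
    return float(np.mean([kl_divergence(u, centroid) for u in feats]))


def words_to_repeat(utterances_by_word, threshold):
    """Return the words whose mean discrimination exceeds the threshold."""
    return [
        word
        for word, utterances in utterances_by_word.items()
        if mean_discrimination(utterances) > threshold
    ]


# Hypothetical data: three utterances per word, three feature bins each.
stable = [[0.7, 0.2, 0.1]] * 3
unstable = [[0.9, 0.05, 0.05], [0.05, 0.9, 0.05], [0.05, 0.05, 0.9]]
print(words_to_repeat({"cat": stable, "dog": unstable}, threshold=0.1))
```

In this toy data the utterances of "cat" are identical, so their mean discrimination is zero, while the utterances of "dog" differ strongly from their centroid and the word is flagged for repeated training. The threshold in a real system would have to be tuned experimentally.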

P. 313–318
