Evaluation Model of College English Multimedia Teaching Effect Based on Deep Convolutional Neural Networks

Mobile Information Systems ◽

10.1155/2021/1874584 ◽

2021 ◽

Vol 2021 ◽

pp. 1-8

Author(s):

Limei Geng

Keyword(s):

Neural Network ◽

Neural Networks ◽

Speech Recognition ◽

Language Learning ◽

Convolutional Neural Network ◽

English Learners ◽

Speech Signal ◽

English Learning ◽

College English ◽

Multimedia Teaching

With the acceleration of global integration, the demand for English instruction is rising steadily. Chinese learners, however, struggle with spoken English because of the limited English learning environment and teaching conditions in China. Advances in artificial intelligence and in language teaching and learning techniques have ushered in a new era of language learning and teaching, and deep learning makes it possible to address this problem. Speech recognition and assessment technology are at the heart of language learning, with speech recognition as the foundation. Because speech pronunciation varies in complex ways, speech signal data are voluminous, speech feature parameters are high-dimensional, and recognition and evaluation are computationally expensive, large-scale speech signal processing places heavy demands on hardware and software resources and on algorithms. Traditional speech recognition algorithms, such as dynamic time warping, hidden Markov models, and artificial neural networks, each have advantages and disadvantages, but they have hit bottlenecks that make it difficult to improve their accuracy and speed further. To address these problems, this paper focuses on evaluating the multimedia teaching effect of college English and proposes a multilevel residual convolutional neural network algorithm for oral English pronunciation recognition, based on a deep convolutional neural network. The experiments show that the algorithm can help learners identify inconsistencies between their pronunciation and standard pronunciation and correct pronunciation errors, improving oral English learning performance.
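For readers unfamiliar with dynamic time warping, one of the traditional techniques this abstract names, the following is a minimal textbook sketch in Python. It is not the paper's method: the function name `dtw_distance` and the use of absolute difference on one-dimensional sequences are assumptions chosen for illustration, whereas real speech systems compare frames of feature vectors.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D numeric sequences.

    Illustrative sketch: the local cost is the absolute difference, and
    each step may advance either sequence or both.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats its frame
                cost[i][j - 1],      # b advances, a repeats its frame
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

Because the alignment may repeat frames, a sequence and a time-stretched copy of it have distance zero, which is the property that made the technique attractive for comparing utterances spoken at different speeds.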

Advanced Convolutional Neural Network-Based Hybrid Acoustic Models for Low-Resource Speech Recognition

Computers ◽

10.3390/computers9020036 ◽

2020 ◽

Vol 9 (2) ◽

pp. 36

Author(s):

Tessfu Geteye Fantaye ◽

Junqing Yu ◽

Tulu Tilahun Hailu

Keyword(s):

Neural Network ◽

Neural Networks ◽

Speech Recognition ◽

Convolutional Neural Network ◽

Network Models ◽

Neural Network Models ◽

Feed Forward ◽

Acoustic Models ◽

Low Resource

Deep neural networks (DNNs) have achieved strong results in acoustic modeling for speech recognition. Among them, the convolutional neural network (CNN) is effective at representing the local properties of speech formants, but it is not suited to modeling long-term context dependencies between speech signal frames. Recurrent neural networks (RNNs) have recently shown great ability to model such dependencies. However, RNNs perform poorly on low-resource speech recognition tasks, even worse than conventional feed-forward neural networks, and they often overfit severely on the training corpus in these tasks. This paper presents our work on combining CNNs and conventional RNNs with gate, highway, and residual networks to reduce these problems. The optimal neural network structures and training strategies for the proposed models are explored. Experiments were conducted on the Amharic and Chaha datasets, as well as on the limited language packages (10-h) of the benchmark datasets released under the Intelligence Advanced Research Projects Activity (IARPA) Babel Program. The proposed neural network models achieve 0.1–42.79% relative performance improvements over their corresponding feed-forward DNN, CNN, bidirectional RNN (BRNN), or bidirectional gated recurrent unit (BGRU) baselines across six language collections. These approaches are promising candidates for developing better-performing acoustic models for low-resource speech recognition tasks.
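The "relative performance improvement" figures quoted above are usually computed as the reduction in an error rate divided by the baseline's error rate. The sketch below assumes the metric is an error rate such as word error rate, where lower is better. The abstract does not name the metric, and the function name and example numbers are hypothetical.

```python
def relative_improvement(baseline_error, model_error):
    """Relative reduction, in percent, of an error rate versus a baseline.

    Assumes lower is better (for example, a word error rate).
    """
    if baseline_error <= 0:
        raise ValueError("baseline error rate must be positive")
    return 100.0 * (baseline_error - model_error) / baseline_error


# Hypothetical numbers: a baseline at 50.0% error and a model at 40.0%.
print(relative_improvement(50.0, 40.0))
```

A drop from 50.0% to 40.0% is 10 points in absolute terms but 20% in relative terms, which is why relative figures can look large even when the absolute gap is small.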

Decentralizing Feature Extraction with Quantum Convolutional Neural Network for Automatic Speech Recognition

ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp39728.2021.9413453 ◽

2021 ◽

Author(s):

Chao-Han Huck Yang ◽

Jun Qi ◽

Samuel Yen-Chi Chen ◽

Pin-Yu Chen ◽

Sabato Marco Siniscalchi ◽

...

Keyword(s):

Neural Network ◽

Feature Extraction ◽

Speech Recognition ◽

Convolutional Neural Network ◽

Automatic Speech Recognition

A Light-weight Convolutional Neural Network based Speech Recognition for Spoken Content Retrieval Task

2020 IEEE International Conference on Systems, Man, and Cybernetics (SMC) ◽

10.1109/smc42975.2020.9282956 ◽

2020 ◽

Author(s):

Nirayo Hailu Gebreegziabher ◽

Andreas Nurnberger

Keyword(s):

Neural Network ◽

Speech Recognition ◽

Convolutional Neural Network ◽

Light Weight ◽

Retrieval Task ◽

Content Retrieval

A High Accuracy Multiple-Command Speech Recognition ASIC Based on Configurable One-Dimension Convolutional Neural Network

2021 IEEE International Symposium on Circuits and Systems (ISCAS) ◽

10.1109/iscas51556.2021.9401401 ◽

2021 ◽

Author(s):

Lindong Wu ◽

Zongwei Wang ◽

Ming Zhao ◽

Wei Hu ◽

Yimao Cai ◽

...

Keyword(s):

Neural Network ◽

Speech Recognition ◽

Convolutional Neural Network ◽

High Accuracy ◽

One Dimension

Classification of Skin Disease Using Deep Learning Neural Networks with MobileNet V2 and LSTM

Sensors ◽

10.3390/s21082852 ◽

2021 ◽

Vol 21 (8) ◽

pp. 2852

Author(s):

Parvathaneni Naga Srinivasu ◽

Jalluri Gnana SivaSai ◽

Muhammad Fazal Ijaz ◽

Akash Kumar Bhoi ◽

Wonjoon Kim ◽

...

Keyword(s):

Neural Network ◽

Neural Networks ◽

Deep Learning ◽

Convolutional Neural Network ◽

Skin Disease ◽

Network Architecture ◽

Large Scale ◽

Short Term Memory ◽

Convolutional Networks ◽

Occurrence Matrix

Deep learning models are efficient at learning the features that help in understanding complex patterns precisely. This study proposes a computerized process for classifying skin disease using deep learning based on MobileNet V2 and Long Short Term Memory (LSTM). The MobileNet V2 model proved efficient, with better accuracy, and can run on lightweight computational devices. The proposed model maintains stateful information for precise predictions. A grey-level co-occurrence matrix is used to assess the progress of diseased growth. Performance was compared against other state-of-the-art models such as Fine-Tuned Neural Networks (FTNN), Convolutional Neural Network (CNN), Very Deep Convolutional Networks for Large-Scale Image Recognition developed by the Visual Geometry Group (VGG), and a convolutional neural network architecture expanded with a few changes. On the HAM10000 dataset, the proposed method outperformed the other methods with more than 85% accuracy. It recognizes the affected region much faster, with almost 2× fewer computations than the conventional MobileNet model, which keeps computational effort minimal. Furthermore, a mobile application is designed for instant and proper action. It helps patients and dermatologists identify the type of disease from an image of the affected region at the initial stage of the skin disease. These findings suggest that the proposed system can help general practitioners diagnose skin conditions efficiently and effectively, thereby reducing further complications and morbidity.
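A grey-level co-occurrence matrix counts how often a pixel with grey level i sits at a fixed offset from a pixel with grey level j. The sketch below shows the standard definition in plain Python, with a contrast feature derived from it. It illustrates the general technique only, not the paper's pipeline: the function names, the default horizontal offset, and the tiny 4-level example image are assumptions.

```python
def glcm(image, levels, dx=1, dy=0):
    """Count ordered pairs of grey levels separated by offset (dx, dy).

    `image` is a list of rows of integers in range(levels). The result is
    a levels x levels matrix of raw, non-symmetric counts.
    """
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


def contrast(matrix):
    """Contrast feature: mean squared grey-level difference over all pairs."""
    total = sum(sum(row) for row in matrix)
    weighted = sum(
        count * (i - j) ** 2
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
    )
    return weighted / total


# Hypothetical 4x4 image with 4 grey levels.
example = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
counts = glcm(example, levels=4)
```

Texture features such as contrast summarize the matrix in a single number, so they can be tracked across images taken at different times.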

Isolated Word Speech Recognition Using Convolutional Neural Network

2020 International Conference on Computer, Control, Electrical, and Electronics Engineering (ICCCEEE) ◽

10.1109/iccceee49695.2021.9429684 ◽

2021 ◽

Author(s):

Aljenan Soliman ◽

Salah Mohamed ◽

Iman Abuelmaaly Abdelrahman

Keyword(s):

Neural Network ◽

Speech Recognition ◽

Convolutional Neural Network ◽

Isolated Word

Programmable phase-change metasurfaces on waveguides for multimode photonic convolutional neural network

Nature Communications ◽

10.1038/s41467-020-20365-z ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Changming Wu ◽

Heshan Yu ◽

Seokhyeong Lee ◽

Ruoming Peng ◽

Ichiro Takeuchi ◽

...

Keyword(s):

Neural Network ◽

Neural Networks ◽

Phase Change ◽

Convolutional Neural Network ◽

Large Scale ◽

Phase Change Materials ◽

Refractive Index Change ◽

Optical Computing ◽

Machine Learning Algorithms ◽

Matrix Vector Multiplication

Neuromorphic photonics has recently emerged as a promising hardware accelerator, with significant potential speed and energy advantages over digital electronics for machine learning algorithms, such as neural networks of various types. Integrated photonic networks are particularly powerful in performing analog computing of matrix-vector multiplication (MVM) as they afford unparalleled speed and bandwidth density for data transmission. Incorporating nonvolatile phase-change materials in integrated photonic devices enables indispensable programming and in-memory computing capabilities for on-chip optical computing. Here, we demonstrate a multimode photonic computing core consisting of an array of programmable mode converters based on on-waveguide metasurfaces made of phase-change materials. The programmable converters utilize the refractive index change of the phase-change material Ge2Sb2Te5 during phase transition to control the waveguide spatial modes with a very high precision of up to 64 levels in modal contrast. This contrast is used to represent the matrix elements, with 6-bit resolution and both positive and negative values, to perform MVM computation in neural network algorithms. We demonstrate a prototypical optical convolutional neural network that can perform image processing and recognition tasks with high accuracy. With a broad operation bandwidth and a compact device footprint, the demonstrated multimode photonic core is promising toward large-scale photonic neural networks with ultrahigh computation throughputs.
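The link between "64 levels" and "6-bit resolution" (2^6 = 64) can be illustrated in software by snapping each matrix element to one of 64 signed values before a matrix-vector multiplication. This is a numerical sketch only. The uniform spacing of levels over [-1, 1] and the function names are assumptions, and the abstract does not describe how modal contrast is mapped to weights on the actual device.

```python
def quantize(w, levels=64, w_max=1.0):
    """Snap w to the nearest of `levels` evenly spaced values in [-w_max, w_max]."""
    step = 2.0 * w_max / (levels - 1)
    clipped = max(-w_max, min(w_max, w))
    index = round((clipped + w_max) / step)  # integer code in 0 .. levels - 1
    return -w_max + index * step


def quantized_mvm(matrix, vector, levels=64):
    """Matrix-vector product with every weight quantized to `levels` values."""
    return [
        sum(quantize(w, levels) * x for w, x in zip(row, vector))
        for row in matrix
    ]


# Hypothetical 2x2 weight matrix applied to a 2-element input.
result = quantized_mvm([[0.30, -0.75], [1.0, -1.0]], [2.0, 3.0])
```

With 64 levels over [-1, 1], the worst-case rounding error per weight is half a step, or 1/63, which bounds how far each output can drift from the full-precision product.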

A DFC taxonomy of Speech emotion recognition based on convolutional neural network from speech signal

2020 5th International Conference on Innovative Technologies in Intelligent Systems and Industrial Applications (CITISIA) ◽

10.1109/citisia50690.2020.9371841 ◽

2020 ◽

Author(s):

Surendra Malla ◽

Abeer Alsadoon ◽

Simi Kamini Bajaj

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Emotion Recognition ◽

Speech Signal ◽

Speech Emotion Recognition

College English Learning Center Planning Based on Language Learning

Tobacco Regulatory Science ◽

10.18001/trs.7.5.2.15 ◽

2021 ◽

Vol 7 (5) ◽

pp. 4493-4499

Author(s):

Diwen Dong

Keyword(s):

Language Learning ◽

Independent Learning ◽

Learning Needs ◽

English Learning ◽

Learning Center ◽

Learning Centers ◽

Learning Materials ◽

College English ◽

The University ◽

Functional Areas

Objectives: Planning English learning centers for college students can meet students’ independent learning needs and strengthen their comprehensive ability to practice and apply English. Methods: This study set out the characteristics and functions of an English learning center, as well as its resources and facilities, for use in planning a university English learning center, and explained the construction of the center’s learning materials and the division of its functional areas. The main factors influencing the construction of learning center materials are students’ language level, their learning needs, and the authority and applicability of the learning materials. Results: On this basis, taking the English learning center plan of a university library as an example, the center is divided into four functional areas: English listening, speaking, reading and writing. Conclusion: It is hoped that this research will provide some reference for planning university English learning centers based on language learning.

EMOTIONS RECOGNITION IN HUMAN SPEECH USING DEEP NEURAL NETWORKS

Vestnik komp iuternykh i informatsionnykh tekhnologii ◽

10.14489/vkit.2021.01.pp.044-051 ◽

2021 ◽

pp. 44-51

Author(s):

E. Yu. Shchetinin

Keyword(s):

Neural Network ◽

Machine Learning ◽

Neural Networks ◽

Convolutional Neural Network ◽

Recurrent Neural Network ◽

Deep Neural Networks ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Audio Recordings ◽

Computer Studies

The recognition of human emotions is one of the most relevant and dynamically developing areas of modern speech technologies, and the recognition of emotions in speech (RER) is the most in-demand part of it. In this paper, we propose a computer model of emotion recognition based on an ensemble of a bidirectional recurrent neural network with LSTM memory cells and the deep convolutional neural network ResNet18. Computer studies are carried out on the RAVDESS database of emotional human speech. RAVDESS is a data set containing 7356 files. Entries are labeled with the following emotions: 0 – neutral, 1 – calm, 2 – happiness, 3 – sadness, 4 – anger, 5 – fear, 6 – disgust, 7 – surprise. In total, the database contains 16 classes (8 emotions divided into male and female) for a total of 1440 samples (speech only). To train machine learning algorithms and deep neural networks to recognize emotions, existing audio recordings must be pre-processed so as to extract the main characteristic features of particular emotions. This was done using Mel-frequency cepstral coefficients, chroma coefficients, and characteristics of the frequency spectrum of the audio recordings. Various neural network models for emotion recognition are studied on the data described above, and classical machine learning algorithms are used for comparative analysis. The following models were trained during the experiments: logistic regression (LR), a support vector machine (SVM) classifier, decision tree (DT), random forest (RF), gradient boosting over trees (XGBoost), a convolutional neural network (CNN, ResNet18), a recurrent neural network (RNN), and an ensemble of convolutional and recurrent networks (Stacked CNN-RNN). The results show that the neural networks achieved much higher accuracy in recognizing and classifying emotions than the classical machine learning algorithms used. Of the three neural network models presented, the CNN + BLSTM ensemble showed the highest accuracy.
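The 16-class labeling described in this abstract (8 emotions, each split into male and female) can be encoded as a single integer per sample. The sketch below uses the emotion numbering given in the abstract. The ordering of the two genders and the function names are assumptions made for illustration, not something the abstract specifies.

```python
# Emotion codes 0..7 as listed in the abstract.
EMOTIONS = [
    "neutral", "calm", "happiness", "sadness",
    "anger", "fear", "disgust", "surprise",
]


def class_index(emotion_id, is_female):
    """Map (emotion, gender) to one of 16 classes.

    Assumed layout: male speakers occupy 0..7 and female speakers 8..15.
    """
    if not 0 <= emotion_id < len(EMOTIONS):
        raise ValueError("emotion_id must be in 0..7")
    return emotion_id + len(EMOTIONS) * int(is_female)


def class_name(index):
    """Inverse mapping, returning a readable label such as 'male_anger'."""
    if not 0 <= index < 2 * len(EMOTIONS):
        raise ValueError("index must be in 0..15")
    gender, emotion_id = divmod(index, len(EMOTIONS))
    return ("female" if gender else "male") + "_" + EMOTIONS[emotion_id]
```

Keeping the emotion code in the low part of the index means the 8-class emotion label can be recovered with `index % 8` when gender is not needed.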
