Research on Music Classification Technology Based on Deep Learning

2021 ◽  
Vol 2021 ◽  
pp. 1-8
Author(s):  
Fang Zhang

With the advent of the digital music era, digital audio sources have exploded in volume. Music classification (MC) is the basis for managing massive music resources. In this paper, we propose an MC method based on deep learning that improves feature extraction and classifier design for the MIDI (Musical Instrument Digital Interface) classification task. Because existing classification techniques are limited by shallow structures, their classifiers struggle to learn the temporal and semantic information in music; this paper therefore proposes a deep-learning-based MIDI MC method. In our experiments, the proposed method achieves 90.1% classification accuracy, outperforming an existing method based on a BP neural network, and the results confirm that the music segmentation method used in this paper classifies music efficiently and correctly. However, owing to the limited time and expertise available in this interdisciplinary field, the methodology of this paper has certain limitations that call for further research and improvement.
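The abstract does not specify how MIDI data is turned into classifier input. A common first step, shown here as a minimal sketch (the function name and the note-tuple format are assumptions, not from the paper), is to rasterize MIDI note events into a binary piano-roll matrix that a deep network can consume:

```python
import numpy as np

def midi_to_pianoroll(notes, n_steps=64, n_pitches=128, step=0.25):
    """Convert (pitch, onset_time, duration) note tuples into a binary
    piano-roll matrix suitable as input to a deep classifier."""
    roll = np.zeros((n_steps, n_pitches), dtype=np.float32)
    for pitch, onset, dur in notes:
        start = int(onset / step)
        end = min(n_steps, int((onset + dur) / step) + 1)
        if start < n_steps:
            roll[start:end, pitch] = 1.0  # mark the time steps the note sounds
    return roll

# A C-major arpeggio: (MIDI pitch, onset in beats, duration in beats)
notes = [(60, 0.0, 0.5), (64, 0.5, 0.5), (67, 1.0, 1.0)]
roll = midi_to_pianoroll(notes)
```

The resulting time-by-pitch matrix preserves the temporal ordering that, as the paper notes, shallow classifiers fail to exploit.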

2021 ◽  
Vol 4 ◽  
Author(s):  
Khalil Damak ◽  
Olfa Nasraoui ◽  
William Scott Sanders

Despite advances in deep learning methods for song recommendation, most existing methods do not take advantage of the sequential nature of song content. In addition, there is a lack of methods that can explain their predictions using the content of recommended songs and only a few approaches can handle the item cold start problem. In this work, we propose a hybrid deep learning model that uses collaborative filtering (CF) and deep learning sequence models on the Musical Instrument Digital Interface (MIDI) content of songs to provide accurate recommendations, while also being able to generate a relevant, personalized explanation for each recommended song. Compared to state-of-the-art methods, our validation experiments showed that in addition to generating explainable recommendations, our model stood out among the top performers in terms of recommendation accuracy and the ability to handle the item cold start problem. Moreover, validation shows that our personalized explanations capture properties that are in accordance with the user’s preferences.
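The hybrid idea of combining collaborative filtering with a content model of song MIDI can be sketched as a weighted blend of two scores. Everything here (latent dimensions, the blending weight `alpha`, the random embeddings) is a hypothetical stand-in for the paper's trained factors and sequence-model embeddings:

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_songs, k = 5, 8, 4

U = rng.normal(size=(n_users, k))  # hypothetical CF user factors
V = rng.normal(size=(n_songs, k))  # hypothetical CF item factors
C = rng.normal(size=(n_songs, k))  # hypothetical MIDI-content embeddings

def hybrid_score(u, i, alpha=0.5):
    """Blend a collaborative-filtering score with a content-based score.
    For a cold-start song with no interactions, alpha could be set to 0
    so the content term alone drives the recommendation."""
    cf = U[u] @ V[i]
    content = U[u] @ C[i]
    return alpha * cf + (1 - alpha) * content

scores = [hybrid_score(0, i) for i in range(n_songs)]
best = int(np.argmax(scores))  # top recommendation for user 0
```

The content term is what allows both cold-start handling and content-grounded explanations, since it ties the score to properties of the recommended song itself.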


2020 ◽  
Vol 13 (4) ◽  
pp. 627-640 ◽  
Author(s):  
Avinash Chandra Pandey ◽  
Dharmveer Singh Rajpoot

Background: Sentiment analysis is the contextual mining of text to determine users' viewpoints on sentimental topics, commonly expressed on social networking websites. Twitter is one such site, where people express their opinions on any topic in the form of tweets. These tweets can be examined using various sentiment classification methods to find the opinions of users. Traditional sentiment analysis methods use manually extracted features for opinion classification. The manual feature extraction process is a complicated task, since it requires predefined sentiment lexicons. Deep learning methods, on the other hand, automatically extract relevant features from data; hence, they provide better performance and richer representational capacity than traditional methods. Objective: The main aim of this paper is to improve sentiment classification accuracy and to reduce computational cost. Method: To achieve this objective, a hybrid deep learning model based on a convolutional neural network and a bi-directional long short-term memory neural network has been introduced. Results: The proposed sentiment classification method achieves the highest accuracy on most of the datasets. Further, the efficacy of the proposed method has been validated through statistical analysis. Conclusion: Sentiment classification accuracy can be improved by creating well-designed hybrid models. Moreover, performance can also be enhanced by tuning the hyperparameters of deep learning models.
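The convolutional half of such a hybrid model extracts local n-gram features from token embeddings. The toy classifier below is a simplified stand-in (random weights, no BiLSTM, tiny dimensions are all assumptions) that shows the embed / convolve / max-pool / sigmoid pipeline:

```python
import numpy as np

rng = np.random.default_rng(1)
vocab, d, seq_len, n_filters, width = 50, 8, 12, 4, 3

E = rng.normal(size=(vocab, d))             # embedding table
W = rng.normal(size=(n_filters, width, d))  # 1-D conv filters over n-grams
w_out = rng.normal(size=n_filters)          # output layer weights

def classify(token_ids):
    """Toy CNN text classifier: embed tokens, slide each filter over the
    sequence, max-pool over time, then squash to a probability."""
    x = E[token_ids]                                   # (seq_len, d)
    feats = []
    for f in range(n_filters):
        acts = [np.sum(x[t:t + width] * W[f])
                for t in range(len(token_ids) - width + 1)]
        feats.append(max(acts))                        # max over time
    z = np.dot(w_out, feats)
    return 1.0 / (1.0 + np.exp(-z))                    # P(positive sentiment)

p = classify(rng.integers(0, vocab, size=seq_len))
```

In the paper's full model, the pooled convolutional features would feed a bi-directional LSTM rather than going straight to the output layer, so that long-range sequence context is also captured.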


Information ◽  
2021 ◽  
Vol 12 (6) ◽  
pp. 249
Author(s):  
Xin Jin ◽  
Yuanwen Zou ◽  
Zhongbing Huang

The cell cycle is an important process in cellular life. In recent years, image processing methods have been developed to determine the cell cycle stage of individual cells. However, most of these methods require cells to be segmented and their features to be extracted, and some important information may be lost during feature extraction, resulting in lower classification accuracy. We therefore used a deep learning method that retains all cell features. To address the insufficient number and imbalanced distribution of original images, we used the Wasserstein generative adversarial network with gradient penalty (WGAN-GP) for data augmentation, together with a residual network (ResNet), one of the most widely used deep learning classification networks, for image classification. With our method, the classification accuracy on cell cycle images reached 83.88%, an increase of 4.48 percentage points over the 79.40% achieved in previous experiments. On another dataset used to verify the model, accuracy increased by 12.52 percentage points over previous results. The results show that our new cell cycle image classification system based on WGAN-GP and ResNet is useful for classifying imbalanced images, and could help overcome the low classification accuracy in biomedical imaging caused by insufficient numbers and imbalanced distributions of original images.
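WGAN-GP's distinguishing ingredient is a gradient penalty evaluated at points interpolated between real and generated samples. Real implementations compute the input gradient by automatic differentiation; the sketch below (a hypothetical linear critic, not the paper's network) uses a critic whose input gradient is known analytically so the penalty term can be shown without an autodiff framework:

```python
import numpy as np

rng = np.random.default_rng(2)
d = 16
w = rng.normal(size=d)          # weights of a toy linear critic D(x) = w . x

real = rng.normal(size=(8, d))  # stand-ins for real and generated images
fake = rng.normal(size=(8, d))

def gradient_penalty(real, fake, lam=10.0):
    """WGAN-GP penalty evaluated on random interpolates between real and
    fake samples. For a linear critic, grad_x D(x) = w everywhere, so the
    penalty reduces to lam * (||w|| - 1)^2 at every interpolate."""
    eps = rng.uniform(size=(len(real), 1))
    x_hat = eps * real + (1 - eps) * fake   # random interpolates
    grad_norm = np.linalg.norm(w)           # ||grad_x D(x_hat)|| for linear D
    return lam * np.mean((grad_norm - 1.0) ** 2), x_hat

gp, x_hat = gradient_penalty(real, fake)
```

Penalizing deviations of the critic's gradient norm from 1 enforces the Lipschitz constraint that makes Wasserstein GAN training stable, which is what lets WGAN-GP generate usable minority-class cell images for augmentation.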


2019 ◽  
Vol 9 (11) ◽  
pp. 326 ◽  
Author(s):  
Hong Zeng ◽  
Zhenhua Wu ◽  
Jiaming Zhang ◽  
Chen Yang ◽  
Hua Zhang ◽  
...  

Deep learning (DL) methods have been used increasingly widely, for example in speech and image recognition. However, designing a DL model that accurately and efficiently classifies electroencephalogram (EEG) signals remains a challenge, mainly because EEG signals differ significantly between subjects, vary over time within a single subject, and are non-stationary, highly random, and low in signal-to-noise ratio. SincNet is an efficient classifier for speaker recognition, but it has some drawbacks when applied to EEG signal classification. In this paper, we improve on it and propose a SincNet-based classifier, SincNet-R, which consists of three convolutional layers and three deep neural network (DNN) layers. We then use SincNet-R to test classification accuracy and robustness on emotional EEG signals. Comparisons with the original SincNet model and traditional classifiers such as CNN, LSTM, and SVM show that our proposed SincNet-R model achieves higher classification accuracy and better algorithm robustness.
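SincNet's core idea is that the first convolutional layer learns only two cutoff frequencies per filter, with the kernel itself built as a band-pass sinc function. A minimal sketch of such a kernel (the window choice, kernel length, and the 128 Hz EEG sampling rate are assumptions for illustration):

```python
import numpy as np

def sinc_bandpass(f1, f2, kernel_len=65, fs=128.0):
    """Band-pass FIR kernel built as the difference of two sinc low-pass
    filters -- the parametrization behind SincNet's first conv layer,
    where only the cutoffs f1 and f2 would be learned."""
    t = np.arange(kernel_len) - (kernel_len - 1) / 2

    def lowpass(fc):
        return 2 * fc / fs * np.sinc(2 * fc / fs * t)

    h = lowpass(f2) - lowpass(f1)   # band-pass = difference of low-passes
    h *= np.hamming(kernel_len)     # window to reduce spectral ripple
    return h

h = sinc_bandpass(4.0, 8.0)  # e.g. an EEG theta band (4-8 Hz)
```

Constraining the kernels this way drastically reduces the first layer's parameter count and yields interpretable filters, which is attractive for noisy, low-data EEG settings.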


2021 ◽  
Vol 13 (3) ◽  
pp. 335
Author(s):  
Yuhao Qing ◽  
Wenyi Liu

In recent years, hyperspectral image classification using deep learning algorithms has attained good results. Spurred by these findings, and to further improve classification accuracy, we propose a multi-scale residual convolutional neural network fused with an efficient channel attention network (MRA-NET) for hyperspectral image classification. The proposed technique comprises a multi-stage architecture: first, the spectral information of the hyperspectral image is reduced to a two-dimensional tensor using principal component analysis (PCA). The resulting low-dimensional image is then input to the proposed deep network, which exploits the advantages of its core components, i.e., a multi-scale residual structure and an efficient channel attention (ECA) mechanism. We evaluate the performance of the proposed MRA-NET on three publicly available hyperspectral datasets and demonstrate overall classification accuracies of 99.82%, 99.81%, and 99.37%, respectively, which are higher than those of current networks such as the 3D convolutional neural network (CNN), the three-dimensional residual convolution structure (RES-3D-CNN), and the space–spectrum joint deep network (SSRN).
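Efficient channel attention gates each feature channel using a lightweight cross-channel interaction: global average pooling per channel, a small 1-D convolution across channels, and a sigmoid. The sketch below uses a fixed averaging kernel in place of the learned one (kernel width and feature-map sizes are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)

def eca(x, k=3):
    """Efficient channel attention: per-channel global average pooling,
    a width-k 1-D convolution across channels (fixed averaging kernel
    here as a stand-in for the learned one), then sigmoid gating."""
    c = x.shape[0]
    pooled = x.reshape(c, -1).mean(axis=1)          # (C,) channel descriptors
    kernel = np.ones(k) / k                         # stand-in learned kernel
    attn = np.convolve(pooled, kernel, mode="same") # local cross-channel mix
    gate = 1.0 / (1.0 + np.exp(-attn))              # sigmoid in (0, 1)
    return x * gate[:, None, None]                  # rescale each channel

x = rng.normal(size=(8, 5, 5))  # (channels, H, W) feature map
y = eca(x)
```

Because the interaction is a single small 1-D convolution rather than fully connected layers, the attention adds almost no parameters, which suits the already-deep multi-scale residual backbone.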


2021 ◽  
pp. 102986492110629
Author(s):  
Richard Parncutt ◽  
Lazar Radovanovic

Since Lippius and Rameau, chords have roots that are often voiced in the bass, doubled, and used as labels. Psychological experiments and analyses of databases of Western classical music have not produced clear evidence for the psychological reality of chord roots. We analyzed a symbolic database of 100 arrangements of jazz standards (musical instrument digital interface [MIDI] files from midkar.com and thejazzpage.de). Selection criteria were representativeness and quality. The original songs had been composed in the 1930s and 1950s, and each file had a beat track. Files were converted to chord progressions by identifying tone onsets near beat locations (±10% of beat duration). Chords were classified as triads (major, minor, diminished, suspended) or seventh chords (major–minor, minor, major, half-diminished, diminished, and suspended) plus extra tones. Roots that were theoretically less ambiguous were more often in the bass or (to a lesser extent) doubled. The root of the minor triad was ambiguous, as predicted (conventional root or third). Of the sevenths, the major–minor had the clearest root. The diminished triad was often part of a major–minor seventh chord; the half-diminished seventh, of a dominant ninth. Added notes (“tensions”) tended to minimize dissonance (roughness or inharmonicity). In arrangements of songs from the 1950s, diminished triads and sevenths were less common, and suspended triads more common, than in those from the 1930s. Results confirm the psychological reality of chord roots and their specific ambiguities. They are consistent with Terhardt’s virtual pitch theory and the idea that musical chords emerge gradually from cultural and historical processes. The approach can enrich music theory (including pitch-class set analysis) and jazz pedagogy.
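The beat-quantization step described above, grouping tone onsets that fall within ±10% of the beat duration of each beat location, can be sketched as follows (the function name, data format, and the equal-beat-period assumption are mine, not the paper's):

```python
def chords_at_beats(onsets, beats):
    """Group note onsets within +/-10% of the beat duration of each beat
    location, returning one pitch-class set per beat. Assumes equally
    spaced beats; onsets are (midi_pitch, time_in_beats) pairs."""
    period = beats[1] - beats[0]
    tol = 0.1 * period
    chords = []
    for b in beats:
        pcs = sorted({p % 12 for p, t in onsets if abs(t - b) <= tol})
        chords.append(pcs)  # e.g. [0, 4, 7] = a C major triad
    return chords

# C major triad near beat 0, then F and A slightly after beat 0.5
onsets = [(60, 0.0), (64, 0.01), (67, 0.02), (65, 0.5), (69, 0.52)]
beats = [0.0, 0.5, 1.0]
chords = chords_at_beats(onsets, beats)
```

Each resulting pitch-class set could then be matched against triad and seventh-chord templates, which is the classification stage the study applies before analyzing root voicing and doubling.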


2022 ◽  
Vol 10 (1) ◽  
pp. 0-0

Effective productivity estimates for fresh produce are essential for efficient farming, commercial planning, and logistical support. In the past ten years, machine learning (ML) algorithms have been widely used for grading and classification of agricultural products in the agriculture sector. However, precise and accurate assessment of the maturity level of tomatoes using ML algorithms is still quite challenging, because these algorithms rely on hand-crafted features. Hence, in this paper we propose a deep-learning-based tomato maturity grading system that increases the accuracy and adaptability of maturity grading tasks with less training data. The performance of the proposed system is assessed on real tomato datasets collected from open fields using a Nikon D3500 CCD camera. The proposed approach achieved an average maturity classification accuracy of 99.8%, which is quite promising in comparison to other state-of-the-art methods.
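Grading with little training data typically means keeping a pretrained feature extractor frozen and training only a small classification head. The sketch below illustrates that recipe with random stand-in features and weights (the dimensions, class count, and all values are assumptions, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(4)
n, d, n_classes = 20, 32, 3  # samples, feature dim, maturity grades

feats = rng.normal(size=(n, d))       # stand-in for frozen pretrained CNN features
W = rng.normal(size=(d, n_classes))   # the only weights that would be trained

def grade(f):
    """Softmax head over frozen features: the usual small-data recipe."""
    z = f @ W
    e = np.exp(z - z.max())           # shift for numerical stability
    return int(np.argmax(e / e.sum()))

grades = [grade(f) for f in feats]
```

Because only `W` is trained, the number of labelled tomato images needed is far smaller than training a full network from scratch would require.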


2021 ◽  
Vol 65 (1) ◽  
pp. 11-22
Author(s):  
Mengyao Lu ◽  
Shuwen Jiang ◽  
Cong Wang ◽  
Dong Chen ◽  
Tian’en Chen

Highlights:
- A classification model for the front and back sides of tobacco leaves was developed for application in industry.
- A tobacco leaf grading method that combines a CNN with double-branch integration was proposed.
- The A-ResNet network was proposed and compared with other classic CNN networks.
- The grading accuracy across eight grades was 91.30% and the testing time was 82.180 ms, showing relatively high classification accuracy and efficiency.

Abstract: Flue-cured tobacco leaf grading is a key step in the production and processing of Chinese-style cigarette raw materials, directly affecting cigarette blend and quality stability. At present, manual grading of tobacco leaves is dominant in China, resulting in unsatisfactory grading quality and consuming considerable material and financial resources. In this study, for fast, accurate, and non-destructive tobacco leaf grading, 2,791 flue-cured tobacco leaves of eight different grades from south Anhui Province, China, were chosen as the study sample, and a tobacco leaf grading method that combines convolutional neural networks with double-branch integration was proposed. First, a classification model for the front and back sides of tobacco leaves was trained by transfer learning. Second, two processing methods (equal-scaled resizing and cropping) were used to obtain global images and local patches from the front sides of tobacco leaves. A global image-based grading model was then developed using the proposed A-ResNet-65 network, and a local patch-based grading model was developed using the ResNet-34 network. These two networks were compared with classic deep learning networks such as VGGNet, GoogLeNet-V3, and ResNet. Finally, the results of the two grading models were integrated to realize tobacco leaf grading. The final model's classification accuracy across eight grades was 91.30%, and grading a single tobacco leaf required 82.180 ms. The proposed method achieved relatively high grading accuracy and efficiency. It provides a method for industrial implementation of tobacco leaf grading and offers a new approach for the quality grading of other agricultural products. Keywords: Convolutional neural network, Deep learning, Image classification, Transfer learning, Tobacco leaf grading
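The double-branch integration step, combining the global-image model's output with the local-patch model's output, can be sketched as a weighted average of the two probability vectors (the blending weight and the 3-grade example values are illustrative assumptions; the study uses eight grades):

```python
import numpy as np

def fuse(p_global, p_local, w=0.5):
    """Double-branch integration: blend the grade probabilities from a
    global-image model and a local-patch model, then pick the argmax."""
    p = w * np.asarray(p_global) + (1 - w) * np.asarray(p_local)
    return int(np.argmax(p)), p

p_g = [0.1, 0.7, 0.2]  # hypothetical global-branch output over 3 grades
p_l = [0.2, 0.5, 0.3]  # hypothetical local-branch output
grade, p = fuse(p_g, p_l)
```

The global branch captures overall leaf shape and color while the local branch captures surface texture, so averaging their outputs lets each compensate for the other's blind spots.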


2018 ◽  
Vol 2018 ◽  
pp. 1-9 ◽  
Author(s):  
Yinghao Chu ◽  
Chen Huang ◽  
Xiaodan Xie ◽  
Bohai Tan ◽  
Shyam Kamal ◽  
...  

This study proposes a multilayer hybrid deep-learning system (MHS) to automatically sort waste disposed of by individuals in urban public areas. The system deploys a high-resolution camera to capture waste images and sensors to detect other useful feature information. The MHS uses a CNN-based algorithm to extract image features and a multilayer perceptron (MLP) to consolidate the image features with the other feature information, classifying wastes as recyclable or other. The MHS was trained and validated against manually labelled items, achieving an overall classification accuracy higher than 90% under two different testing scenarios, significantly outperforming a reference CNN-based method that relies on image-only inputs.
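The fusion step, feeding CNN image features and sensor readings into one MLP, can be sketched with a tiny two-layer network (random weights and all dimensions are illustrative assumptions, not the trained MHS):

```python
import numpy as np

rng = np.random.default_rng(5)
d_img, d_sensor, hidden = 16, 3, 8  # image features, sensor readings, hidden units

W1 = rng.normal(size=(d_img + d_sensor, hidden))
w2 = rng.normal(size=hidden)

def classify(img_feat, sensor_feat):
    """Toy MLP fusing CNN image features with sensor readings into a
    binary recyclable-vs-other probability."""
    x = np.concatenate([img_feat, sensor_feat])  # early feature fusion
    h = np.maximum(0.0, x @ W1)                  # ReLU hidden layer
    return 1.0 / (1.0 + np.exp(-h @ w2))         # sigmoid output

p = classify(rng.normal(size=d_img), rng.normal(size=d_sensor))
```

Concatenating the two modalities before the hidden layer is what lets the sensor channel disambiguate items that look alike on camera, which is presumably why the fused system beats the image-only baseline.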

