Learning Low-Dimensional Embeddings of Audio Shingles for Cross-Version Retrieval of Classical Music

2019 ◽  
Vol 10 (1) ◽  
pp. 19 ◽  
Author(s):  
Frank Zalkow ◽  
Meinard Müller

Cross-version music retrieval aims at identifying all versions of a given piece of music using a short query audio fragment. One previous approach, which is particularly suited for Western classical music, is based on a nearest neighbor search using short sequences of chroma features, also referred to as audio shingles. From the viewpoint of efficiency, indexing and dimensionality reduction are important aspects. In this paper, we extend previous work by adapting two embedding techniques: one based on classical principal component analysis and the other on neural networks with triplet loss. Furthermore, we report on systematically conducted experiments with Western classical music recordings and discuss the trade-off between retrieval quality and embedding dimensionality. As one main result, we show that, using neural networks, one can reduce the audio shingles from 240 to fewer than 8 dimensions with only a moderate loss in retrieval accuracy. In addition, we present extended experiments with databases of different sizes and different query lengths to test the scalability and generalizability of the dimensionality reduction methods. We also provide a more detailed view into the retrieval problem by analyzing the distances that appear in the nearest neighbor search.
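As a rough illustration of the kind of pipeline described above (not the authors' implementation), the following Python sketch reduces 240-dimensional shingles with scikit-learn's PCA and queries them with a nearest neighbor index; the random shingle data, the target dimensionality, and the neighbor count are placeholder assumptions.

import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)

# Hypothetical data: each row is one audio shingle, e.g. 20 stacked
# 12-dimensional chroma frames -> 240 dimensions.
database_shingles = rng.random((10_000, 240))
query_shingles = rng.random((5, 240))

# Learn a low-dimensional embedding; the target dimensionality trades
# retrieval quality against index size.
pca = PCA(n_components=8)
database_embedded = pca.fit_transform(database_shingles)
query_embedded = pca.transform(query_shingles)

# Nearest neighbor index over the embedded database.
index = NearestNeighbors(n_neighbors=10, metric="euclidean")
index.fit(database_embedded)
distances, neighbor_ids = index.kneighbors(query_embedded)
print(neighbor_ids.shape)  # (5, 10): 10 candidate shingles per query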

2019 ◽  
Vol 11 (1) ◽  
pp. 168781401881917
Author(s):  
Fang Lv ◽  
Yuliang Wei ◽  
Xixian Han ◽  
Bailing Wang

With the explosive growth of surveillance data, exact match queries become much more difficult because of the data's high dimensionality and volume. Owing to its good balance between retrieval performance and computational cost, hash learning is widely used to solve approximate nearest neighbor search problems. Dimensionality reduction plays a critical role in hash learning, as its goal is to preserve as much of the original information as possible in low-dimensional vectors. However, existing dimensionality reduction methods neglect to unify the diverse sources of information in the original space when learning a reduced subspace. In this article, we propose a numeric and semantic consistency semi-supervised hash learning method, which unifies numeric features and supervised semantic features in a low-dimensional subspace before hash encoding, and improves a multi-table hashing method with a complementary numeric local distribution structure. A consistency-based learning method, which confers semantic meaning on the numeric features during dimensionality reduction, is presented. Experiments are conducted on two public datasets: the web image dataset NUS-WIDE and the text dataset DBLP. The results demonstrate that the semi-supervised hash learning method, with its consistency-based information subspace, preserves useful information for hash encoding more effectively than state-of-the-art methods and achieves high-quality retrieval performance in the multi-table setting.
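The Python sketch below is only a generic stand-in for the kind of pipeline the abstract describes: a linear projection to a low-dimensional subspace followed by binary hash codes stored in several complementary hash tables. The random projection, the random-hyperplane hashing, and all sizes are illustrative assumptions, not the proposed consistency-based method.

import numpy as np

rng = np.random.default_rng(0)
features = rng.standard_normal((5_000, 512))   # hypothetical numeric features
n_tables, n_bits, subspace_dim = 4, 16, 64

# Stand-in for the learned subspace: a random linear projection.
projection = rng.standard_normal((512, subspace_dim))
low_dim = features @ projection

tables = []
for _ in range(n_tables):
    hyperplanes = rng.standard_normal((subspace_dim, n_bits))
    codes = (low_dim @ hyperplanes > 0).astype(np.uint8)  # binary hash codes
    buckets = {}
    for idx, code in enumerate(map(bytes, codes)):
        buckets.setdefault(code, []).append(idx)
    tables.append((hyperplanes, buckets))

def query(vector):
    """Collect candidates from all tables, then rank them by exact distance."""
    candidates = set()
    low = vector @ projection
    for hyperplanes, buckets in tables:
        code = bytes((low @ hyperplanes > 0).astype(np.uint8))
        candidates.update(buckets.get(code, []))
    return sorted(candidates, key=lambda i: np.linalg.norm(low_dim[i] - low))

print(query(features[42])[:5])

Multiple tables are used because a single hash table misses true neighbors that fall just across a hyperplane; complementary tables recover many of them.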


Author(s):  
A. Murat Yagci ◽  
Tevfik Aytekin ◽  
Fikret S. Gurgen

Matrix factorization models often reveal the low-dimensional latent structure in high-dimensional spaces while bringing space efficiency to large-scale collaborative filtering problems. Improving the training and prediction time efficiency of these models is also important, since even an accurate model may raise practical concerns if it is too slow to capture the changing dynamics of the system. For the training task, powerful improvements have been proposed, especially using SGD, ALS, and their parallel versions. In this paper, we focus on the prediction task and combine matrix factorization with approximate nearest neighbor search methods to improve the efficiency of top-N prediction queries. Our efforts result in a meta-algorithm, MMFNN, which can employ various common matrix factorization models, drastically improve their prediction efficiency, and still perform comparably to, or sometimes even better than, standard prediction approaches in terms of predictive power. Using various batch, online, and incremental matrix factorization models, we present detailed empirical analysis results on many large implicit feedback datasets from different application domains.
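A minimal sketch of the general idea (not MMFNN itself): serve top-N queries by indexing pre-trained item factors with a nearest neighbor structure instead of scoring every item. The random factors and the use of cosine distance as a surrogate for the model's inner-product score are assumptions made purely for illustration.

import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
n_users, n_items, k = 1_000, 50_000, 32
user_factors = rng.standard_normal((n_users, k))   # placeholder factors
item_factors = rng.standard_normal((n_items, k))

# Index the item factors once; cosine similarity stands in for the inner
# product here, which is only a rough surrogate unless the factors are
# transformed accordingly.
index = NearestNeighbors(n_neighbors=10, metric="cosine")
index.fit(item_factors)

def top_n(user_id, n=10):
    _, item_ids = index.kneighbors(user_factors[user_id:user_id + 1], n_neighbors=n)
    return item_ids[0]

print(top_n(7))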


Author(s):  
A. MITICHE ◽  
J. K. AGGARWAL

The purpose of this paper is two-fold: to give a synoptic description of favored neural networks and to characterize the potency of these neural networks as pattern classifiers, against the background of the familiar nearest neighbors classification. We limit the study to those neural network structures most commonly used for pattern classification: the multilayer perceptron, the Kohonen associative memory, and the Carpenter–Grossberg clustering network, for which we give a tutorial description with the aim of making the driving concepts apparent. The nearest neighbors rule is presented with improved nearest neighbor search and reference data sample pruning. To gain some familiarity with the classifiers, we expound the sequence of computations implicated in pattern category assignment by each classifier. A characterization of the classifiers is drawn from observed and expected properties and from experiments in automatic target recognition and optical character recognition as summarized in comparative tables of performance. This characterization supports the suggestion that nearest neighbors classification always be considered before endorsing alternative pattern classifiers such as neural networks.
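To make the suggested baseline concrete, here is a short scikit-learn example of the nearest neighbors rule on the bundled digits data, a stand-in for the optical character recognition task; the dataset and parameters are illustrative, not taken from the paper.

from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load a small handwritten-digit dataset and split it into train/test sets.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Plain nearest neighbors rule: assign the majority label of the 3 closest
# training samples.
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)
print("k-NN test accuracy:", knn.score(X_test, y_test))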


2020 ◽  
Author(s):  
Cameron Hargreaves ◽  
Matthew Dyer ◽  
Michael Gaultois ◽  
Vitaliy Kurlin ◽  
Matthew J Rosseinsky

It is a core problem in any field to reliably tell how close two objects are to being the same, and once this relation has been established we can use this information to precisely quantify potential relationships, both analytically and with machine learning (ML). For inorganic solids, the chemical composition is a fundamental descriptor, which can be represented as a vector of the ratios of each element in the material. These vectors are a convenient mathematical data structure for measuring similarity, but unfortunately, the standard metric (the Euclidean distance) gives little to no variance in the resultant distances between chemically dissimilar compositions. We present the Earth Mover’s Distance (EMD) for inorganic compositions, a well-defined metric which enables chemical similarity to be measured in an explainable fashion. We compute the EMD between two compositions from the ratio of each of the elements and the absolute distance between the elements on the modified Pettifor scale. This simple metric shows clear strength at distinguishing compounds and is efficient to compute in practice. The resultant distances have greater alignment with chemical understanding than the Euclidean distance, which is demonstrated on the binary compositions of the Inorganic Crystal Structure Database (ICSD). The EMD is a reliable numeric measure of chemical similarity that can be incorporated into automated workflows for a range of ML techniques. We have found that, with no supervision, the use of this metric gives a distinct partitioning of binary compounds into clear trends and families of chemical properties, with future applications for nearest neighbor search queries in chemical database retrieval systems and supervised ML techniques.
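As a rough sketch of the idea (not the authors' code), the EMD between two compositions can be computed with scipy's one-dimensional Wasserstein distance, using each element's position on the modified Pettifor scale as its location and its ratio as its weight. The Pettifor numbers below are placeholders for illustration, not the actual scale values.

from scipy.stats import wasserstein_distance

# Hypothetical positions on the modified Pettifor scale (placeholders only;
# the real values must be looked up from the published scale).
pettifor = {"Na": 10, "K": 11, "Cl": 94, "Br": 93}

def composition_emd(comp_a, comp_b):
    """comp_a, comp_b: dicts mapping element -> fractional amount (summing to 1)."""
    a_positions = [pettifor[el] for el in comp_a]
    b_positions = [pettifor[el] for el in comp_b]
    return wasserstein_distance(a_positions, b_positions,
                                list(comp_a.values()), list(comp_b.values()))

# NaCl vs. KBr: neighbouring alkali metals and halogens, so the distance is small.
print(composition_emd({"Na": 0.5, "Cl": 0.5}, {"K": 0.5, "Br": 0.5}))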

