Speaker Recognition Using Wavelet Packet Entropy, I-Vector, and Cosine Distance Scoring

2017 ◽  
Vol 2017 ◽  
pp. 1-9 ◽  
Author(s):  
Lei Lei ◽  
She Kun

Speaker recognition now benefits a growing number of people. However, its accuracy often drops rapidly in the presence of low-quality speech and noise. This paper proposes a new speaker recognition model based on wavelet packet entropy (WPE), i-vector, and cosine distance scoring (CDS). In the proposed model, WPE transforms the speech into short-term spectral feature vectors (short vectors) and resists noise. An i-vector is generated from those short vectors and characterizes the speech to improve recognition accuracy. CDS quickly compares two i-vectors to produce the recognition result. The proposed model is evaluated on the TIMIT speech database. The experimental results show that the proposed model performs well in both clean and noisy environments and is insensitive to low-quality speech, but its time cost is high. To reduce the time cost, parallel computation is used.
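The cosine distance scoring step can be illustrated with a minimal sketch. It assumes the two i-vectors are already available as plain lists of floats; the function names and the decision threshold are hypothetical and are not taken from the paper.

```python
import math


def cosine_score(w_target, w_test):
    """Return the cosine similarity between two i-vectors of equal length."""
    if len(w_target) != len(w_test):
        raise ValueError("i-vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(w_target, w_test))
    norm_target = math.sqrt(sum(a * a for a in w_target))
    norm_test = math.sqrt(sum(b * b for b in w_test))
    if norm_target == 0.0 or norm_test == 0.0:
        raise ValueError("i-vectors must be non-zero")
    return dot / (norm_target * norm_test)


def same_speaker(w_target, w_test, threshold=0.5):
    """Accept the trial when the score reaches the threshold.

    The default threshold is an assumed placeholder; in practice it
    would be tuned on held-out data.
    """
    return cosine_score(w_target, w_test) >= threshold
```

Because the score depends only on the angle between the two vectors, no per-trial model training is needed, which is consistent with the abstract describing CDS as the fast part of the pipeline.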

2016 ◽  
Vol 2016 ◽  
pp. 1-11 ◽  
Author(s):  
Lei Lei ◽  
She Kun

An important application of speaker recognition is forensics. However, the accuracy of speaker recognition in forensic cases often drops rapidly because of ambient noise, channel variability, differing durations of speech data, and similar factors. Finding a robust speaker recognition model is therefore very important for forensics. This paper builds a new speaker recognition model based on the wavelet cepstral coefficient (WCC), i-vector, and cosine distance scoring (CDS). The model first uses the WCC to transform the speech into spectral feature vectors and then uses those vectors to train i-vectors that represent speech of different durations. CDS compares the i-vectors to produce the evidence. Moreover, linear discriminant analysis (LDA) and within-class covariance normalization (WCCN) are added to the CDS algorithm to deal with the channel variability problem. Finally, the likelihood ratio estimates the strength of the evidence. We use the TIMIT database to evaluate the performance of the proposed model. The experimental results show that the proposed model can effectively handle the difficulties of the forensic scenario, but the time cost of the method is high.


Author(s):  
Koosha Choobdari Omran ◽  
Ali Mosallanejad

Purpose: Double rotor induction machine (DRIM) is a particular type of induction machine (IM) that has been introduced to improve the parameters of the conventional IM. The purpose of this study is to propose a dynamic model of the DRIM under saturated and unsaturated conditions by using the equations obtained in this paper. Also, skin and temperature effects are considered in this model. Design/methodology/approach: First, the DRIM structure and its performance are briefly reviewed. Then, to realize the DRIM model, the mathematical equations of the electrical and mechanical parts of the DRIM are presented as state equations in the q-d axis by using the Park transformation. In this paper, the saturation of the magnetizing fluxes is included in the DRIM model by considering the difference between the amplitudes of the unsaturated and saturated magnetizing fluxes. The skin and temperature effects are also considered in this model by correcting the rotor and stator resistance values during operation. Findings: To evaluate the effects of saturation and skin effects on DRIM performance and validate the model, the machine is simulated with and without consideration of saturation and skin effects by the proposed model. Then, the results, including torque, speed, stator and rotor currents, active and reactive power, efficiency, power factor and torque-speed characteristic, are compared. In addition, the performance of the DRIM has been investigated at different speed conditions and load variations. The proposed model is developed in MATLAB/Simulink for the sake of validation. Originality/value: This paper presents an understandable model of the DRIM with and without saturation, which can be used to analyze the steady-state and transient behavior of the motor in different situations.


2017 ◽  
Vol 2017 ◽  
pp. 1-10 ◽  
Author(s):  
Wen-Jun Li ◽  
Qiang Dong ◽  
Yan Fu

With the rapid development of the mobile Internet and smart devices, more and more online content providers have begun to collect the preferences of their customers through various apps on mobile devices. These preferences are largely reflected by the ratings, given as explicit scores, on online items. Both positive and negative ratings help recommender systems provide relevant items to a target user. Based on an empirical analysis of three real-world movie-rating data sets, we observe that users' rating criteria change over time, and that past positive and negative ratings have different influences on users' future preferences. Given this, we propose a recommendation model on a session-based temporal graph that considers the difference between long- and short-term preferences and the different temporal effects of positive and negative ratings. Extensive experimental results validate the significant accuracy improvement of our proposed model compared with state-of-the-art methods.


Author(s):  
P. Vijayalakshmi ◽  
K. Muthumanickam ◽  
G. Karthik ◽  
S. Sakthivel

Adenomyosis is an abnormality in the uterine wall of women that adversely affects their normal lifestyle. If not treated properly, it may lead to severe health issues. The symptoms of adenomyosis are identified from MRI images. It is a gynaecological disease that may lead to infertility. The presence of red dots in the uterus is the major symptom of adenomyosis. The difference in the extent of these red dots extracted from MRI images shows how significant the deviation from normality is. Thus, we propose an entroxon-based bio-inspired intelligent water drop back-propagation neural network (BIWDNN) model to discover the probability of infertility being caused by adenomyosis and endometriosis. First, vital features from the images are extracted and segmented, and then they are classified using the fuzzy C-means clustering algorithm. The extracted features are then attributed and compared with a normal person's extracted attributes. The proposed BIWDNN model is evaluated using training and testing datasets, and the predictions are estimated using the testing dataset. The proposed model produces an improved diagnostic precision rate on infertility.


2020 ◽  
Vol 12 (11) ◽  
pp. 1746
Author(s):  
Salman Ahmadi ◽  
Saeid Homayouni

In this paper, we propose a novel approach based on the active contours model for change detection from synthetic aperture radar (SAR) images. To increase the accuracy of the proposed approach, a new operator was introduced to generate a difference image from the before-change and after-change images. Then, a new model of active contours was developed for accurately detecting changed regions from the difference image. The proposed model extracts the changed areas as a target feature from the difference image based on training data from changed and unchanged regions. In this research, we used the Otsu histogram thresholding method to produce the training data automatically. In addition, the training data were updated in the process of minimizing the energy function of the model. To evaluate the accuracy of the model, we applied the proposed method to three benchmark SAR data sets. The proposed model obtains Kappa coefficients of 84.65%, 87.07%, and 96.26% for the Yellow River Estuary, Bern, and Ottawa sample data sets, respectively. These results demonstrate the effectiveness of the proposed approach compared to other methods. Another advantage of the proposed model is its high speed in comparison to conventional methods.
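The Kappa coefficient used as the accuracy measure above can be computed from a confusion matrix of changed and unchanged pixels. The following is a minimal sketch of Cohen's kappa; the function name and the example counts are hypothetical and are not results from the paper.

```python
def kappa_coefficient(confusion):
    """Cohen's kappa for a square confusion matrix given as a list of rows.

    confusion[i][j] is the number of samples whose reference class is i
    and whose predicted class is j.
    """
    size = len(confusion)
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(size)) / total
    # Chance agreement: product of the row and column marginals per class.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(size)) for j in range(size)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical counts: rows are reference (unchanged, changed),
# columns are detected (unchanged, changed).
example = [[40, 10], [5, 45]]
kappa = kappa_coefficient(example)
```

Unlike plain overall accuracy, kappa discounts the agreement that would be expected by chance, which matters when unchanged pixels heavily outnumber changed ones.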


NANO ◽  
2009 ◽  
Vol 04 (03) ◽  
pp. 171-176 ◽  
Author(s):  
DAVOOD FATHI ◽  
BEHJAT FOROUZANDEH

This paper introduces a new technique for analyzing the behavior of global interconnects in FPGAs for nanoscale technologies. Using this enhanced modeling method, more accurate expressions for calculating the propagation delay of global interconnects in nano-FPGAs have been derived. To verify the proposed model, we performed delay simulations at the 45 nm, 65 nm, 90 nm, and 130 nm technology nodes with our modeling method and with the conventional Pi-model technique. The results obtained from these two methods were then compared with HSPICE simulation results. The comparison shows that the propagation delays computed for global interconnects with our proposed model match the HSPICE simulations better than those computed with conventional techniques such as the Pi-model. According to the obtained results, the difference between our model and the HSPICE simulations at the mentioned technology nodes is (0.29–22.92)%, whereas this difference is (11.13–38.29)% for the conventional model.


2018 ◽  
Vol 2018 ◽  
pp. 1-6
Author(s):  
Chuen-Lin Tien ◽  
Rong-Ji Lin ◽  
Shang-Min Yeh

Light leakage from liquid crystal displays in the dark state is relatively large and leads to a degraded contrast ratio and color shift. This work describes a novel colorimetric model based on the Mueller matrix that includes depolarization of light propagating through liquid crystal molecules, polarizers, and color filters. With the proposed model, the chromaticity can be estimated in the bump and no-bump regions of an LCD. We show that the difference between the simulated and measured chromaticity is about 0.01. Light leakage in the bump region is three times that in the no-bump region in the dark state.


2010 ◽  
Vol 13 (1) ◽  
pp. 17-30
Author(s):  
Luan Hong Pham ◽  
Nhan Thanh Duong

The time-cost optimization problem is one of the most important aspects of construction project management. To maximize the return, construction planners strive to optimize the project duration and cost concurrently. Over the years, many studies have been conducted to model time-cost relationships; the modeling techniques range from heuristic methods and mathematical approaches to genetic algorithms. In this paper, an evolutionary-based optimization algorithm known as ant colony optimization (ACO) is applied to solve the multi-objective time-cost problem. By incorporating the modified adaptive weight approach (MAWA), the proposed model finds the most feasible solutions. The ACO-TCO model is implemented as a computer program on the Visual Basic platform. An example was analyzed to illustrate the capabilities of the proposed model and to compare it against a GA-based TCO model. The results indicate that the ant colony system approach is able to generate better solutions without heavy use of computational resources, which can provide a useful means to support construction planners and managers in efficiently making better time-cost decisions.


2020 ◽  
Author(s):  
Chaofeng Lan ◽  
Yuanyuan Zhang ◽  
Hongyun Zhao

This paper draws on the training method of the Recurrent Neural Network (RNN). By increasing the number of hidden layers of the RNN, changing the activation function on the input layer from the traditional Sigmoid to Leaky ReLU, and zero-padding the first and last groups of data to make more effective use of the data, an improved noise-reduction model, the Denoise Recurrent Neural Network (DRNN), with high calculation speed and good convergence, is constructed to address the low speaker recognition rate in noisy environments. With this model, random semantic speech signals from the speech library, sampled at 16 kHz with a duration of 5 seconds, are studied. The experimental signal-to-noise ratios are −10 dB, −5 dB, 0 dB, 5 dB, 10 dB, 15 dB, 20 dB, and 25 dB. In the noisy environment, the improved model is used to denoise the Mel Frequency Cepstral Coefficients (MFCC) and the Gammatone Frequency Cepstral Coefficients (GFCC), and the impact of the traditional model and the improved model on the speech recognition rate is analyzed. The research shows that the improved model can effectively eliminate noise in the feature parameters and improve the speech recognition rate. When the signal-to-noise ratio is low, the improvement in the speaker recognition rate is more obvious. Furthermore, when the signal-to-noise ratio is 0 dB, the speaker recognition rate is increased by 40%, which can be an 85% improvement compared with the traditional speech model. On the other hand, as the signal-to-noise ratio increases, the recognition rate gradually increases. When the signal-to-noise ratio is 15 dB, the speaker recognition rate is 93%.
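The change of activation function described above, from Sigmoid to Leaky ReLU, can be illustrated with a minimal sketch. The abstract does not state the negative slope, so the value 0.01 below is an assumption, and the function names are illustrative only.

```python
import math


def sigmoid(x):
    """Logistic sigmoid, written to avoid overflow for large |x|."""
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def leaky_relu(x, negative_slope=0.01):
    """Leaky ReLU: identity for x >= 0, a small linear slope for x < 0.

    negative_slope=0.01 is an assumed value, not one given in the abstract.
    """
    return x if x >= 0.0 else negative_slope * x


# Compare the two activations on a few sample inputs.
samples = [-4.0, -1.0, 0.0, 1.0, 4.0]
sigmoid_out = [sigmoid(x) for x in samples]
leaky_out = [leaky_relu(x) for x in samples]
```

The sigmoid saturates for inputs of large magnitude, so its gradient approaches zero there, whereas Leaky ReLU keeps a non-zero gradient on both sides of zero. This is one common motivation for such a swap, though the abstract itself does not give the authors' reasoning.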

