A Preprocessing Strategy for Denoising of Speech Data Based on Speech Segment Detection

2020 ◽  
Vol 10 (20) ◽  
pp. 7385
Author(s):  
Seung-Jun Lee ◽  
Hyuk-Yoon Kwon

In this paper, we propose a preprocessing strategy for denoising speech data based on speech segment detection. Computationally efficient speech denoising is necessary for a method to scale to large data sets. It becomes more important as deep learning-based methods are adopted, because they generally achieve high performance at significant computational cost. The basic idea of the proposed method is to use speech segment detection to exclude non-speech segments before denoising. Speech segment detection can exclude, at negligible cost, the non-speech segments that would otherwise be removed in the denoising process at a much higher cost, while maintaining the accuracy of denoising. First, we devise a framework for choosing the best speech segment detection-based preprocessing method for a target environment. For this, we specify the denoising environments using different levels of signal-to-noise ratio (SNR) and multiple evaluation metrics. The framework finds the speech segment detection method best suited to a target environment according to a performance evaluation of the candidate methods. Next, we investigate the accuracy of speech segment detection methods extensively. We evaluate five speech segment detection methods at different SNR levels and with different evaluation metrics. In particular, we show that the balance between precision and recall of each method can be adjusted by controlling a parameter. Finally, we incorporate the best speech segment detection method for a target environment into a denoising process. Through extensive experiments, we show that the accuracy of the proposed scheme is comparable to or even better than that of Wavenet-based denoising, a recent advanced denoising method based on deep neural networks, in terms of multiple denoising evaluation metrics, i.e., SNR, STOI, and PESQ, while it reduces the denoising time of Wavenet-based denoising by approximately 40–50%, depending on the speech segment detection method used.
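The idea of dropping non-speech segments before an expensive denoiser can be illustrated with a minimal sketch. It uses a simple frame-energy threshold as the speech segment detector, which is an assumption made for illustration and not necessarily one of the five methods the paper evaluates. The function names, the frame length, and the threshold value are all hypothetical; the threshold stands in for the kind of parameter that shifts the balance between precision and recall.

```python
import math

def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]

def detect_speech_frames(samples, frame_len, threshold):
    """Flag a frame as speech when its energy exceeds the threshold.

    Assumption: a higher threshold favors precision (fewer false speech
    frames), a lower threshold favors recall (fewer missed speech frames).
    """
    return [e > threshold for e in frame_energies(samples, frame_len)]

def keep_speech(samples, frame_len, threshold):
    """Drop frames flagged as non-speech so the denoiser processes less audio."""
    flags = detect_speech_frames(samples, frame_len, threshold)
    kept = []
    for idx, is_speech in enumerate(flags):
        if is_speech:
            kept.extend(samples[idx * frame_len:(idx + 1) * frame_len])
    return kept

# Toy signal: 4 near-silent frames followed by 4 frames of a 440 Hz tone
# sampled at 8 kHz (hypothetical values chosen only for the demonstration).
FRAME = 80
quiet = [0.001] * (4 * FRAME)
loud = [0.5 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(4 * FRAME)]
signal = quiet + loud

flags = detect_speech_frames(signal, FRAME, threshold=0.01)
speech_only = keep_speech(signal, FRAME, threshold=0.01)
```

In this toy case only the second half of the signal would be passed on to the denoiser, which is where the time saving reported in the abstract would come from.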

2021 ◽  
Vol 2021 ◽  
pp. 1-8
Author(s):  
Baoying Chen ◽  
Shunquan Tan

Recently, various Deepfake detection methods have been proposed, most of them based on convolutional neural networks (CNNs). These detection methods suffer from overfitting on the source dataset and do not perform well on cross-domain datasets whose distributions differ from that of the source dataset. To address these limitations, this paper proposes FeatureTransfer, a two-stage Deepfake detection method combined with transfer learning. First, a CNN model pretrained on a third-party large-scale Deepfake dataset is used to extract more transferable feature vectors from the Deepfake videos in the source and target domains. Second, these feature vectors are fed into a domain-adversarial neural network based on backpropagation (BP-DANN) for unsupervised domain-adaptive training, where the videos in the source domain have real or fake labels while the videos in the target domain are unlabelled. The experimental results indicate that FeatureTransfer can effectively mitigate the overfitting problem in Deepfake detection and greatly improve performance in cross-dataset evaluation.


2020 ◽  
Vol 10 (19) ◽  
pp. 6799
Author(s):  
Zhuoran Ma ◽  
Liang Gao ◽  
Yanglong Zhong ◽  
Shuai Ma ◽  
Bolun An

During the long-term service of slab track, various external factors (such as complicated temperature conditions) can cause a range of slab damage. Among these, slab arching changes the structural mechanical properties, deteriorates the track geometry, and can even threaten train operation. Therefore, slab arching must be detected accurately to achieve effective maintenance. However, current damage detection methods cannot deliver high accuracy and low cost simultaneously, which makes large-scale, efficient arching detection difficult. To this end, this paper proposes a vision-based arching detection method using track geometry data. The main work includes: (1) nonlinear deviation correction of the data and analysis of arching characteristics; (2) data conversion and augmentation; (3) design of and experiments on a convolutional neural network-based detection model. The results show that the proposed method can detect arching damage effectively, with an F1-score of 98.4%. Balancing the sample size of each pattern can further improve performance. Moreover, the method outperforms the plain deep learning network. In practice, the proposed method can be used to detect slab arching and to help make maintenance plans. It can also be applied to the data-based detection of other structural damage and has broad prospects.


2019 ◽  
Vol 20 (S25) ◽  
Author(s):  
Fei Luo

Background: Copy number alterations (CNAs) have been found to be tightly associated with cancers, so detecting them accurately is one of the most important tasks in cancer genomics. A series of CNA detection methods have been proposed, and new ones are still being developed. Owing to the complexity of CNAs in cancers, no CNA detection method has been accepted as the gold standard caller. Several evaluation studies have attempted to characterize the performance of typical CNA detection methods. Limited by the scale of their evaluation data, these comparisons do not reach a consensus, and researchers remain unsure how to choose a suitable CNA caller for their analysis. A more comprehensive evaluation of typical CNA detection methods is therefore needed. Results: In this work, we use a large-scale real dataset from the CAGEKID consortium to evaluate a total of 12 typical CNA detection methods. These methods are the most widely used in cancer research and are routinely used as benchmarks for newly proposed CNA detection methods. The dataset comprises SNP array data on 94 samples and whole genome sequencing data on 10 samples. The evaluation covers the current scenarios of CNA detection: detecting CNAs on SNP array data, on sequencing data with matched tumor and normal samples, and on sequencing data with a single tumor sample. Three SNP-based methods are ranked first. Subsequently, the results of the best SNP-based method are used as the benchmark to compare six matched-sample methods and three single-tumor-sample methods in terms of preprocessing, recall rate, Jaccard index, and segmentation characteristics. Conclusions: Our survey thoroughly reveals the strengths and weaknesses of 12 typical methods. We explain from a methodological standpoint why the methods show their specific characteristics. Finally, we present guiding principles for choosing a suitable CNA detection method under specific conditions. Some unsolved problems and expectations for upcoming CNA detection methods are also addressed.
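To make the recall rate and Jaccard index comparison concrete, the sketch below scores a set of called segments against a benchmark set at single-position resolution. This is an illustrative assumption: the survey may define both measures at segment level or per chromosome, and the function names and half-open `(start, end)` coordinates are hypothetical.

```python
def covered_positions(segments):
    """Expand half-open (start, end) segments into a set of positions."""
    positions = set()
    for start, end in segments:
        positions.update(range(start, end))
    return positions

def recall_and_jaccard(called, benchmark):
    """Position-level recall and Jaccard index of called vs. benchmark segments."""
    called_pos = covered_positions(called)
    bench_pos = covered_positions(benchmark)
    shared = called_pos & bench_pos
    union = called_pos | bench_pos
    # Recall: fraction of benchmark positions that the caller recovered.
    recall = len(shared) / len(bench_pos) if bench_pos else 0.0
    # Jaccard index: shared positions over all positions covered by either set.
    jaccard = len(shared) / len(union) if union else 0.0
    return recall, jaccard

# Toy coordinates on a single chromosome (made up for the demonstration).
benchmark = [(100, 200), (400, 500)]
called = [(150, 250), (400, 450)]
recall, jaccard = recall_and_jaccard(called, benchmark)
```

Expanding segments into sets keeps the sketch short but is only practical for small toy coordinates; genome-scale data would call for interval arithmetic instead.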


2020 ◽  
Vol 10 (18) ◽  
pp. 6274
Author(s):  
Tiantian Zhu ◽  
Zhengqiu Weng ◽  
Lei Fu ◽  
Linqi Ruan

A web shell is a malicious script file that can harm web servers. Intruders often use web shells to perform malicious operations on website servers, such as privilege escalation and sensitive information leakage. Existing web shell detection methods have shortcomings, such as examining only a single network traffic behavior, using simple signature comparisons, and relying on easily bypassed regex matches. To address these deficiencies, a web shell detection method based on multiview feature fusion is proposed for PHP web shells. First, lexical, syntactic, and abstract features that can effectively represent the internal meaning of web shells at multiple levels are extracted and integrated. Second, the Fisher score is used to rank the features by importance and select the most representative ones. Finally, an optimized support vector machine (SVM) is used to build a model that can effectively distinguish web shells from normal scripts. In large-scale experiments, the final classification accuracy of the model on 1056 web shells and 1056 benign web scripts reached 92.18%. The results also surpassed well-known web shell detection tools such as VirusTotal, ClamAV, LOKI, and CloudWalker, as well as state-of-the-art web shell detection methods.
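The feature-ranking step can be sketched with the common textbook form of the Fisher score: between-class scatter divided by within-class scatter, computed per feature. Whether the paper uses exactly this variant is an assumption. The two toy features and their values are hypothetical and are not taken from the paper's feature set.

```python
from statistics import fmean, pvariance

def fisher_score(values, labels):
    """Between-class scatter over within-class scatter for one feature."""
    overall_mean = fmean(values)
    between = 0.0
    within = 0.0
    for cls in set(labels):
        group = [v for v, y in zip(values, labels) if y == cls]
        between += len(group) * (fmean(group) - overall_mean) ** 2
        within += len(group) * pvariance(group)
    # A feature with no within-class spread separates the classes perfectly.
    return between / within if within > 0 else float("inf")

def rank_features(rows, labels):
    """Return feature indices sorted from most to least discriminative."""
    n_features = len(rows[0])
    scores = [
        fisher_score([row[j] for row in rows], labels)
        for j in range(n_features)
    ]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)

# Hypothetical features: column 0 separates the classes, column 1 does not.
rows = [
    [9.0, 0.2], [8.0, 0.9], [10.0, 0.4],  # label 1: web shell
    [1.0, 0.3], [2.0, 0.8], [0.0, 0.5],   # label 0: benign script
]
labels = [1, 1, 1, 0, 0, 0]
ranking = rank_features(rows, labels)
```

Keeping only the top-ranked indices from `ranking` would give the reduced feature set that is then passed to the classifier.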


Author(s):  
Zhiying Mu ◽  
Zhihu Li ◽  
Xiaoyu Li

Correctly classifying and filtering common libraries in Android applications can effectively improve the accuracy of repackaged application detection. However, existing common library detection methods barely meet the requirements of large-scale app markets because their classification rules make detection slow. To address this problem, a structural similarity-based common library detection method for Android is presented. Sub-packages that are weakly associated with the main package are extracted from the decompiled APK (Android application package) as common library candidates using the PDG (program dependency graph) method. With package structures and API calls as features, the candidates are classified through coarse-grained and fine-grained filtering. Experimental results on a dataset of real-world applications show that the presented method achieves a higher detection speed while maintaining accuracy and false positive rate. The method is shown to be efficient and precise.


2014 ◽  
Vol 556-562 ◽  
pp. 2928-2931
Author(s):  
Wen Lai Liu

The traditional reliability detection method for large-scale automation software is based on the modular design principle of the software and checks reliability features one by one. It does not consider the concurrent reliability chain problems of automation software, which leads to low detection accuracy. This paper proposes a novel reliability detection method for automation software systems based on path-based interfaces. The detection model integrates the features of the automation software. The established stochastic point process and state probability transition diagram overcome the shortcomings of traditional reliability detection methods for large-scale automation software. The experimental results show that the improved method can increase the detection accuracy for large-scale automation software and can be widely applied.


2020 ◽  
Vol 21 ◽  
Author(s):  
Yin-xue Wang ◽  
Yi-xiang Wang ◽  
Yi-ke Li ◽  
Shi-yan Tu ◽  
Yi-qing Wang

Ovarian cancer (OC) is one of the deadliest gynecological malignancies, and epithelial ovarian cancer (EOC) is its most common form. OC has both a poor prognosis and a high mortality rate owing to the difficulty of early diagnosis, the limitations of current treatments, and resistance to chemotherapy. Extracellular vesicles (EVs) are a heterogeneous group of cell-derived submicron vesicles that can be detected in body fluids; they are classified into three main types: exosomes, micro-vesicles, and apoptotic bodies. Cancer cells can produce more EVs than healthy cells. Moreover, the contents of these EVs have been found to be distinct from each other. EVs shed from tumor cells have been considered for clinical applications, such as tools for tumor diagnosis, prognosis, and the potential treatment of certain cancers. In this review, we provide a brief description of EVs in the diagnosis, prognosis, treatment, and drug resistance of OC. Cancer-related EVs exert powerful influences on tumors through various biological mechanisms. However, the findings described above remain at the laboratory stage: large-scale clinical trials are lacking, and the immaturity of purification and detection methods is a constraint. In addition, amplification of oncogenes on ecDNA is remarkably prevalent in cancer, so it may be possible that ecDNA is encapsulated in EVs and can thus be detected. In summary, much more research on EVs needs to be performed to reveal breakthroughs in OC and to accelerate their clinical application.


Electronics ◽  
2020 ◽  
Vol 9 (11) ◽  
pp. 1894
Author(s):  
Chun Guo ◽  
Zihua Song ◽  
Yuan Ping ◽  
Guowei Shen ◽  
Yuhei Cui ◽  
...  

Remote Access Trojans (RATs) are among the most serious security threats that organizations face today. At present, the two major RAT detection approaches are host-based and network-based. To combine their complementary strengths, this article proposes a phased RAT detection method that combines double-side features (PRATD). In PRATD, host-side and network-side features are combined to build detection models, which helps distinguish RATs from benign programs because RATs not only generate traffic on the network but also leave traces on the host at run time. In addition, PRATD trains two different detection models for the two runtime states of RATs to improve the True Positive Rate (TPR). Experiments on network and host records collected from five kinds of benign programs and 20 well-known RATs show that PRATD can effectively detect RATs: it achieves a TPR as high as 93.609% with a False Positive Rate (FPR) as low as 0.407% for known RATs, and a TPR of 81.928% with an FPR of 0.185% for unknown RATs, which suggests that it is a competitive candidate for RAT detection.
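For reference, the two rates reported above follow the standard confusion-matrix definitions, TPR = TP / (TP + FN) and FPR = FP / (FP + TN). The minimal sketch below computes them from binary labels; the function name and the toy labels are hypothetical and are not data from the article.

```python
def tpr_fpr(y_true, y_pred):
    """True Positive Rate and False Positive Rate for binary labels (1 = RAT)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    # Guard against empty classes so the sketch never divides by zero.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr

# Toy labels: 4 RAT records and 6 benign records.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
tpr, fpr = tpr_fpr(y_true, y_pred)
```

Here three of the four RAT records are caught and one of the six benign records is flagged, so the detector trades a missed RAT against a single false alarm.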

