Comparison of Geographical Traceability of Wild and Cultivated Macrohyporia cocos with Different Data Fusion Approaches

2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Li Wang ◽  
Qinqin Wang ◽  
Yuanzhong Wang ◽  
Yunmei Wang

Poria, the dried sclerotium of Macrohyporia cocos, is an edible traditional Chinese medicine with high economic value. Because wild and cultivated M. cocos differ significantly in quality, this study traced the geographical origin of the fungus separately for wild and cultivated samples. Data fusion is a promising strategy, but few studies have applied or discussed it in the geographical traceability of M. cocos, so multiple data fusion approaches were compared here. Supervised pattern recognition techniques, namely partial least squares discriminant analysis (PLS-DA) and random forest, were employed. Five types of data fusion, covering low-, mid-, and high-level strategies, were performed. Two feature extraction approaches were considered in data fusion: variable selection with Boruta, a random forest-based algorithm, and dimension reduction to principal components with principal component analysis (PCA). The results indicate the following: (1) Wild and cultivated samples differed in the content of vital chemical components and in their fingerprints. (2) Wild samples required data fusion for origin traceability, and the accuracy of the validation set was 95.24%. (3) Boruta outperformed PCA in feature extraction. (4) The mid-level Boruta PLS-DA model took full advantage of information synergy and showed the best performance. This study shows that both the geographical traceability and the optimal identification methods differ between cultivated and wild samples, and that data fusion is a promising technique for geographical identification.
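The three fusion levels named in this abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the block names `ftir` and `lc`, the toy numbers, the nearest-centroid classifier (a stand-in for PLS-DA or random forest), and the centroid-spread selector (a stand-in for Boruta or PCA). It is not the authors' pipeline.

```python
# Toy illustration of low-, mid-, and high-level data fusion.
# All data, names, and the simple classifier are hypothetical stand-ins.

def concat(*blocks):
    """Join the per-sample feature vectors of several blocks side by side."""
    return [[value for row in rows for value in row] for rows in zip(*blocks)]

def centroids(X, y):
    """Mean feature vector of each class."""
    groups = {}
    for row, label in zip(X, y):
        groups.setdefault(label, []).append(row)
    return {label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in groups.items()}

def distances(cents, row):
    """Squared Euclidean distance from one sample to every class centroid."""
    return {label: sum((a - b) ** 2 for a, b in zip(row, c))
            for label, c in cents.items()}

def predict(X_train, y_train, row):
    """Nearest-centroid prediction (stand-in for PLS-DA / random forest)."""
    d = distances(centroids(X_train, y_train), row)
    return min(d, key=d.get)

def select_features(X, y, k):
    """Keep the k columns whose class centroids are furthest apart
    (a crude stand-in for Boruta or PCA feature extraction)."""
    cents = list(centroids(X, y).values())
    spread = [max(col) - min(col) for col in zip(*cents)]
    ranked = sorted(range(len(spread)), key=lambda j: spread[j], reverse=True)
    return sorted(ranked[:k])

def take(X, keep):
    """Restrict every sample to the selected column indices."""
    return [[row[j] for j in keep] for row in X]

# Two hypothetical measurement blocks for the same four training samples.
labels = ["origin_A", "origin_A", "origin_B", "origin_B"]
ftir_train = [[1.0, 5.0, 0.2], [1.2, 5.1, 0.1], [3.0, 5.0, 0.2], [3.1, 4.9, 0.3]]
lc_train = [[10.0, 0.5], [10.2, 0.6], [10.1, 2.5], [9.9, 2.4]]
ftir_test = [[2.9, 5.0, 0.2]]
lc_test = [[10.0, 2.6]]

# Low-level fusion: concatenate the raw blocks, then classify.
# (Real data would normally be scaled per block first.)
low_level = predict(concat(ftir_train, lc_train), labels,
                    concat(ftir_test, lc_test)[0])

# Mid-level fusion: extract features per block, concatenate them, then classify.
keep_ftir = select_features(ftir_train, labels, 1)
keep_lc = select_features(lc_train, labels, 1)
mid_train = concat(take(ftir_train, keep_ftir), take(lc_train, keep_lc))
mid_test = concat(take(ftir_test, keep_ftir), take(lc_test, keep_lc))
mid_level = predict(mid_train, labels, mid_test[0])

# High-level fusion: one model per block, then combine their outputs.
# Summing per-class distances is just one possible combination rule.
d_ftir = distances(centroids(ftir_train, labels), ftir_test[0])
d_lc = distances(centroids(lc_train, labels), lc_test[0])
combined = {label: d_ftir[label] + d_lc[label] for label in d_ftir}
high_level = min(combined, key=combined.get)

print(low_level, mid_level, high_level)
```

The levels differ only in where the blocks are merged: before modelling (low), after per-block feature extraction (mid), or after per-block modelling (high).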

2021 ◽  
Author(s):  
Qinqin Wang ◽  
Yuan-Zhong Wang ◽  
Yunmei Wang

Background: Poria, the dried sclerotium of Macrohyporia cocos, is an edible traditional Chinese medicine with high economic value. Because wild and cultivated M. cocos differ significantly in quality, the study traced the geographical origin of the fungus separately for wild and cultivated samples. Data fusion is a promising strategy, but few studies have applied or discussed it in the geographical traceability of M. cocos, so multiple data fusion approaches were compared here. Methods: Supervised pattern recognition techniques, namely partial least squares discriminant analysis (PLS-DA) and random forest, were employed. Five types of data fusion, covering low-, mid-, and high-level strategies, were performed. Two feature extraction approaches were considered in data fusion: variable selection with Boruta, a random forest-based algorithm, and dimension reduction to principal components with principal component analysis (PCA). Results: (1) Wild and cultivated samples differed in the content of vital chemical components and in their fingerprints. (2) Cultivated samples from different origins could be easily identified by Fourier transform infrared spectroscopy or liquid chromatography alone, while wild samples required data fusion. (3) Boruta outperformed PCA in feature extraction. (4) Mid-level-Boruta ranked ahead of Mid-level-PCA, low-level and high-level data fusion, and the individual techniques. The Mid-level-Boruta PLS-DA model took full advantage of information synergy and showed the best performance. Conclusions: This study shows that both the geographical traceability and the optimal identification methods differ between cultivated and wild samples, and that data fusion is a promising technique for geographical identification.


2017 ◽  
Vol 1 (1) ◽  
pp. 51
Author(s):  
Darma Setiawan Putra ◽  
Adhi Dharma Wibawa ◽  
Mauridhi Hery Purnomo

The electromyography (EMG) signal is an electrical signal present in the muscle layers during active movement. The way a person walks is determined by the structure of their muscles and bones, so gait is unique and can be used as biometric data. In this study, we classified EMG data from eight lower-limb muscles during normal walking trials: Rectus Femoris, Vastus Lateralis, Vastus Medialis, Biceps Femoris, Semitendinosus, Gastrocnemius Lateralis, Gastrocnemius Medialis, and Tibialis Anterior. Six subjects were asked to walk in the GaitLab laboratory with eight EMG electrodes attached to their muscles. Each subject walked for one gait cycle, and data were recorded three times. The total EMG dataset for classification comprised 18 recordings. Graph feature extraction and principal component analysis were used to extract features from the EMG data. The Random Forest method was used to classify the EMG data by subject. Training and testing of the EMG data used cross validation (CV). The classification accuracy was 88.88% with graph feature extraction and 72.22% with principal component analysis. These results show that EMG data recorded from eight lower-limb muscles during walking can be used for gait-based biometric identification.
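The reported accuracies can be checked against the dataset size with a little arithmetic. The sketch below assumes that accuracy is the fraction of the 18 recordings (6 subjects, 3 recordings each) classified correctly. The counts of 16 and 13 correct recordings are inferred from the percentages and are not stated in the abstract.

```python
# Relate the reported accuracies to the 18-recording dataset.
# The "correct" counts are inferred here, not reported by the authors.

subjects = 6
recordings_per_subject = 3
n_recordings = subjects * recordings_per_subject  # 18 recordings in total

def accuracy_percent(correct, total):
    """Accuracy as a percentage of correctly classified recordings."""
    return 100.0 * correct / total

def nearest_correct_count(reported_percent, total):
    """Whole number of correct recordings closest to a reported accuracy."""
    return round(reported_percent * total / 100.0)

graph_correct = nearest_correct_count(88.88, n_recordings)
pca_correct = nearest_correct_count(72.22, n_recordings)

print(graph_correct, accuracy_percent(graph_correct, n_recordings))
print(pca_correct, accuracy_percent(pca_correct, n_recordings))
```

Under that assumption, the two percentages are consistent with 16 of 18 and 13 of 18 recordings being assigned to the correct subject, with the figures truncated to two decimals.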


Polymers ◽  
2021 ◽  
Vol 13 (23) ◽  
pp. 4117
Author(s):  
Y-h. Taguchi ◽  
Turki Turki

Developing medical applications for substances or materials that come into contact with cells is important. Hence, it is necessary to elucidate how the substances that surround cells affect gene expression during incubation. In the current study, we compared the gene expression profiles of cell lines in contact with a collagen–glycosaminoglycan mesh with those of control cells. Principal component analysis-based unsupervised feature extraction was applied to identify genes whose expression was altered during incubation in the treated cell lines but not in the controls. The identified genes were enriched in various biological terms. Our method also outperformed a conventional methodology, namely, gene selection based on linear regression with time course.
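The general idea of PCA-based unsupervised feature extraction can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the authors' procedure: the expression values are invented, a two-group design replaces the time course, the first principal component is found by power iteration, and genes are picked by a fixed top-N cut-off rather than a statistical test.

```python
import math

# Simplified sketch: find the leading principal component across samples
# without using class labels, then keep genes with the largest scores on it.
# Data, the top-N cut-off, and all function names are hypothetical.

def center_rows(matrix):
    """Subtract each gene's mean across samples."""
    centered = []
    for row in matrix:
        mean = sum(row) / len(row)
        centered.append([x - mean for x in row])
    return centered

def first_pc_loadings(x, iterations=200):
    """Leading eigenvector of the sample-by-sample matrix X^T X (power iteration)."""
    n_samples = len(x[0])
    gram = [[sum(row[i] * row[j] for row in x) for j in range(n_samples)]
            for i in range(n_samples)]
    # Start from a unit vector; an all-ones start would be orthogonal to
    # every component because the rows are centered.
    v = [1.0] + [0.0] * (n_samples - 1)
    for _ in range(iterations):
        w = [sum(gram[i][j] * v[j] for j in range(n_samples))
             for i in range(n_samples)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return v

def select_top_genes(expression, n_top):
    """Return indices of the n_top genes with the largest |PC1 score|."""
    x = center_rows(expression)
    loadings = first_pc_loadings(x)
    scores = [sum(a * b for a, b in zip(row, loadings)) for row in x]
    ranked = sorted(range(len(scores)), key=lambda g: abs(scores[g]), reverse=True)
    return sorted(ranked[:n_top]), loadings

# Rows are genes; columns are samples (two controls, then two treated).
expression = [
    [1.0, 1.1, 3.0, 3.1],  # gene 0: higher in treated samples
    [2.0, 2.1, 2.0, 1.9],
    [5.0, 4.9, 5.1, 5.0],
    [4.0, 4.1, 1.0, 0.9],  # gene 3: lower in treated samples
    [0.5, 0.4, 0.5, 0.6],
    [3.0, 3.0, 3.1, 2.9],
]

selected, loadings = select_top_genes(expression, n_top=2)
print(selected)
```

The group labels are never passed to the selection step. They would only be used afterwards, to check that the sample loadings of the chosen component separate treated from control samples.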


Author(s):  
Y-H. Taguchi ◽  
Mitsuo Iwadate ◽  
Hideaki Umeyama ◽  
Yoshiki Murakami ◽  
Akira Okamoto

Feature extraction (FE) is difficult when the number of features is much larger than the number of samples, although that is the typical situation when biological (big) data are analyzed. It is especially difficult when FE must be stable, that is, independent of the samples considered (stable FE), as is often required. However, the stability of FE has not been considered seriously. In this chapter, the authors demonstrate that Principal Component Analysis (PCA)-based unsupervised FE functions as stable FE. Three bioinformatics applications of PCA-based unsupervised FE are discussed: detection of aberrant DNA methylation associated with diseases, biomarker identification using circulating microRNA, and proteomic analysis of bacterial culturing processes.


Author(s):  
Mohsen Moshki ◽  
Mehran Garmehi ◽  
Peyman Kabiri

In this chapter, the application of Principal Component Analysis (PCA) and one of its extensions to intrusion detection is investigated. The extended version of PCA is modified to address an important shortcoming of traditional PCA. The modifications are first proved mathematically to be beneficial, and a well-known dataset, DARPA99, is then used to verify the results experimentally. To verify the approach, traditional PCA is first used to preprocess the dataset, and the effectiveness of multiclass classification is then studied with a simple classifier, KNN. In the reported work, a revised version of PCA named Weighted PCA (WPCA) is used for feature extraction instead of traditional PCA. The results on the DARPA99 dataset show that this approach yields better accuracy than traditional PCA when the number of features is limited, the number of classes is large, and the class populations are unbalanced. In some situations WPCA outperforms traditional PCA by more than 1% in accuracy.

