K-Nearest Robust Active Learning on Big Data and Application in Epitope Prediction

2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Tianchi Lu

B-cells that induce antigen-specific immune responses in vivo produce large numbers of antigen-specific antibodies by recognizing subregions (epitopes) of antigenic proteins, through which they can inhibit the function of the antigen protein. Epitope region prediction facilitates the design and development of vaccines that induce the production of antigen-specific antibodies. Many diseases are difficult to treat without vaccines, and COVID-19 has devastated many people’s lives, so developing vaccines against COVID-19 is very important. Developing vaccines requires a large number of experiments to obtain labeled targets, but obtaining large amounts of labeled data from experiments is challenging. Big data analysis offers some solutions to this challenge. Big data technology has developed rapidly and has been applied in many areas. In bioinformatics, big data analysis addresses a large number of problems, particularly through active learning. Active learning is a method of building more predictive models with less labeled data: it asks an oracle (a human) to label the most valuable samples and trains the model on them. Applying active learning to vaccine development is therefore meaningful, because scientists do not need to run as many experiments. This paper proposes a more robust active learning method based on uncertainty sampling and K-nearest density and applies it to vaccine manufacture. The new algorithm is evaluated for accuracy and robustness; to evaluate the robustness of active learners, a new robustness index is designed. The new algorithm is compared with a pool-based active learning algorithm, a density-weighted active learning algorithm, and a traditional machine learning algorithm. Finally, the new algorithm is applied to epitope prediction on B-cell data, which is significant for making vaccines.
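The abstract does not give the scoring rule, so the following is only a minimal sketch of how uncertainty sampling might be combined with a K-nearest density term. It assumes the score is the product of prediction entropy and the inverse mean distance to the k nearest pool neighbours; the function names, the toy pool, and the predicted probabilities are all hypothetical and are not taken from the paper.

```python
import math

def entropy(probs):
    """Shannon entropy of a predicted class distribution (higher = more uncertain)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def knn_density(index, points, k):
    """Assumed density proxy: inverse of the mean distance to the k nearest neighbours."""
    dists = sorted(
        math.dist(points[index], other)
        for j, other in enumerate(points)
        if j != index
    )
    mean_dist = sum(dists[:k]) / k
    return 1.0 / (1.0 + mean_dist)

def select_query(points, probs, k=2):
    """Index of the unlabeled sample with the highest uncertainty * density score."""
    scores = [
        entropy(probs[i]) * knn_density(i, points, k)
        for i in range(len(points))
    ]
    return max(range(len(points)), key=scores.__getitem__)

# Hypothetical unlabeled pool: three clustered points and one isolated outlier.
pool = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
# Hypothetical model outputs; samples 0 and 3 are equally uncertain.
pred = [(0.5, 0.5), (0.6, 0.4), (0.9, 0.1), (0.5, 0.5)]

query = select_query(pool, pred, k=2)
print(query)
```

In this toy pool, samples 0 and 3 have the same entropy, but sample 3 is an isolated outlier, so the density term steers the query toward sample 0. Avoiding outlier queries is one common motivation for density weighting; whether the paper's robustness gain comes from this effect is not stated in the abstract.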

2019 ◽  
Vol 11 (13) ◽  
pp. 3499 ◽  
Author(s):  
Se-Hoon Jung ◽  
Jun-Ho Huh

This study proposes a big data analysis and prediction model for transmission line tower outliers, based on deep reinforcement learning, to assess when something is wrong in transmission line tower big data. The model automatically chooses the cluster K value from unlabeled sensor big data. It also measures the distance of action between data inside a cluster, with the Q-value representing the network output, in a modified transmission line tower big data clustering algorithm that incorporates transmission line tower outliers and the existing Deep Q Network. Specifically, this study performed principal component analysis to categorize transmission line tower data and proposed an automatic initial central point approach based on the standard normal distribution. It also proposed the A-Deep Q-Learning algorithm, a modification of the deep Q-Learning algorithm, to explore policies based on the experience of learning from clustered data. The algorithm can be used to learn transmission line tower outlier data based on the distance between data within a cluster. The performance evaluation results show that the proposed model achieved an approximately 2.29% to 4.19% higher prediction rate and around 0.8% to 4.3% higher accuracy rate than the existing transmission line tower big data analysis model.


2021 ◽  
Author(s):  
Sadia Jahan ◽  
Md Rafiqul Islam ◽  
Khan Md. Hasib ◽  
Usman Naseem ◽  
Md. Saiful Islam

2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Cheng Zhang ◽  
Xingjun Liu

In recent years, deep learning has made good progress and has been applied to face recognition, video monitoring, image processing, and other fields. Against this big data background, deep convolutional neural networks have also received more and more attention. In order to extract ancient Chinese characters effectively, the paper discusses the structure model, pooling process, and network training of the deep convolutional neural network and compares the algorithm with a traditional machine learning algorithm. The results show that the accuracy and recall rate for the Chinese characters in the Ming Dynasty plaque peak at 81.38% and 81.31%, respectively. When the number of training samples increases to 50, the recognition rate of MFA is 99.72%, which is much higher than that of the other algorithms. This shows that the algorithm based on a deep convolutional neural network and big data analysis performs well and can effectively identify Chinese characters from different dynasties, with different sample sizes, and under different interference factors, which can provide a useful reference for the extraction of ancient Chinese characters.


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Chayanon Phucharoen ◽  
Tatiyaporn Jarumaneerat ◽  
Nichapat Sangkaew

Purpose: Based on big data analytical and statistical techniques, this study aims to examine tourists’ shopping experiences at department stores and street markets in Phuket.
Design/methodology/approach: A Naïve Bayes machine learning algorithm was used to identify the most frequently used terms in TripAdvisor reviews of both department stores and street markets contributed by the same pool of 729 tourists.
Findings: A total of 18 out of 62 terms used were common in reviews of both shopping settings. However, the study found significant differences in the mean use of the 18 common terms and the likelihood of those terms being used in overall positive reviews.
Practical implications: The study’s findings indicate differences in tourist shopping experiences at department stores and street markets. Several concrete recommendations are made, including a greater focus on the linkage to the national characteristic of street markets, and particularly the quality of local fruit, to enhance the tourist shopping experience.
Originality/value: Understanding the differences between shopping malls and street markets from the tourist’s perspective would further enhance the coexistence of shopping malls and street markets in tourism-led growth cities. As such, using reviews of both shopping malls and street markets from an identical pool of tourists, the present study will analyse and compare tourists’ actual shopping experiences, thereby addressing this gap in the research canon via integrated statistical and big data analysis techniques.
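As a rough illustration of the kind of quantity a Naïve Bayes analysis of review terms works with, the sketch below computes Laplace-smoothed estimates of P(term | positive review) from a bag-of-words count. The three reviews, the labels, and the function name are hypothetical; the study's actual preprocessing, vocabulary, and model settings are not described in the abstract.

```python
from collections import Counter

def term_likelihoods(reviews, label, alpha=1.0):
    """Laplace-smoothed P(term | label) over a bag-of-words vocabulary."""
    # Vocabulary is built from every review, regardless of label.
    vocab = {w for text, _ in reviews for w in text.lower().split()}
    # Term counts are taken only from reviews carrying the requested label.
    counts = Counter(
        w
        for text, y in reviews
        if y == label
        for w in text.lower().split()
    )
    total = sum(counts.values()) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}

# Hypothetical labeled reviews, for illustration only.
reviews = [
    ("fresh fruit great price", "positive"),
    ("great local fruit", "positive"),
    ("crowded and overpriced", "negative"),
]

pos = term_likelihoods(reviews, "positive")
top_terms = sorted(pos, key=pos.get, reverse=True)[:2]
print(sorted(top_terms))
```

Comparing such per-term likelihoods between two review sets (for example department stores versus street markets) is one plausible way to examine how differently a shared term is used in positive reviews.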


2019 ◽  
Vol 9 (1) ◽  
pp. 01-12 ◽  
Author(s):  
Kristy F. Tiampo ◽  
Javad Kazemian ◽  
Hadi Ghofrani ◽  
Yelena Kropivnitskaya ◽  
Gero Michel

2020 ◽  
Vol 25 (2) ◽  
pp. 18-30
Author(s):  
Seung Wook Oh ◽  
Jin-Wook Han ◽  
Min Soo Kim
