The Effect of Training and Testing Process on Machine Learning in Biomedical Datasets

2020 ◽  
Vol 2020 ◽  
pp. 1-17 ◽  
Author(s):  
Muhammed Kürşad Uçar ◽  
Majid Nour ◽  
Hatem Sindi ◽  
Kemal Polat

The training and testing process is very important for the classification of biomedical datasets in machine learning. Researchers should carefully choose the methods used at every step. However, there are very few studies on method choices, and the studies in the literature are generally theoretical. Moreover, there is no useful model for how to select samples in the training and testing process. There is therefore a need for resources in machine learning that discuss the training and testing process in detail and offer new recommendations. This article provides a detailed analysis of the training and testing process in machine learning and is organized as follows. The third section describes how to prepare the datasets; four balanced datasets were used for the application. The fourth section describes the training and test ratio and how to select samples at the training and testing stage. Sampling theory, a fundamental subject of statistics, shows how to select samples, and this article proposes using sampling methods in the training and testing process of machine learning. The fourth section also covers the theoretical expression of four different sampling theorems, and the results section reports their performance. The fifth section describes the methods by which training and pretest features can be selected. In the study, three different classifiers are used to check performance. The results section describes how the results should be analyzed. Additionally, this article proposes performance evaluation methods for evaluating its results. This article examines in detail the effect of the training and testing process on performance in machine learning and proposes the use of sampling theorems for that process. According to the results, the datasets, feature selection algorithms, classifiers, and the training and test ratio are criteria that directly affect performance. However, the methods of selecting samples at the training and testing stages are vital for the system to work correctly. To design a stable system, it is recommended that samples be selected with the stratified systematic sampling theorem.
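The abstract does not spell out the authors' exact procedure, so the following is only a minimal sketch of one common reading of stratified systematic sampling: samples are grouped by class label (the strata), and within each stratum every k-th sample is assigned to the test set. The function name `stratified_systematic_split` and its parameters `test_ratio` and `start` are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def stratified_systematic_split(labels, test_ratio=0.2, start=0):
    """Split sample indices into train/test sets (illustrative sketch).

    Assumption: within each class, every k-th sample goes to the test
    set, where k = round(1 / test_ratio) and `start` is the offset of
    the first selected sample in each stratum.
    """
    if not 0 < test_ratio < 1:
        raise ValueError("test_ratio must be between 0 and 1")
    step = round(1 / test_ratio)

    # Group sample indices by class label (one stratum per class).
    strata = defaultdict(list)
    for idx, label in enumerate(labels):
        strata[label].append(idx)

    train_idx, test_idx = [], []
    for members in strata.values():
        # Systematic selection: fixed interval within the stratum.
        for pos, idx in enumerate(members):
            if pos % step == start % step:
                test_idx.append(idx)
            else:
                train_idx.append(idx)
    return sorted(train_idx), sorted(test_idx)


if __name__ == "__main__":
    labels = [0] * 10 + [1] * 10
    train, test = stratified_systematic_split(labels, test_ratio=0.2)
    print("test indices:", test)
```

Because the selection interval is applied separately inside each class, the class proportions in the test set track those in the full dataset, and the split is deterministic for a given sample order.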

2019 ◽  
Vol 8 (3) ◽  
pp. 35-37
Author(s):  
R. Ravikumar ◽  
M. Babu Reddy

In machine learning, as the dimensionality of the data rises, the amount of data required to provide a reliable analysis grows exponentially. Many feature selection and feature extraction methods exist for reducing the dimensionality of high-dimensional microarray data, and they are widely used. All of these methods aim to remove redundant and irrelevant features so that new instances are classified more accurately. Analyzing microarrays can be difficult because of the size of the data they provide. In addition, the complicated relations among the different genes make analysis more difficult, and removing excess features can improve the quality of the results. Feature selection has been an active and fruitful area of research in the pattern recognition, machine learning, statistics, and data mining communities. The main objective of this paper is feature selection: choosing a subset of input variables by eliminating features.


2018 ◽  
Vol 232 ◽  
pp. 01022
Author(s):  
Zhe Wang ◽  
Baoan Li ◽  
Xueqiang Lv ◽  
Zhian Dong

In this paper, we study the task of template building for automatically generating NBA match reports from NBA live text. As a preliminary study, we collect and process historical reports compiled by editors and obtain different kinds of sentences. Our proposal is to divide the sentences of NBA match reports into 11 categories, which cover almost all cases. We use different machine learning methods to classify the sentences. Each class is then used to construct a template library that serves subsequent automatic writing. By comparing different methods, we obtain a classification structure with higher accuracy. The evaluation results show that our method does construct a template library.


Author(s):  
Damian Alberto

The manual classification of a large amount of textual material is very costly in time and personnel. For this reason, a lot of research has been devoted to the problem of automatic classification, and work on the subject dates from 1960. A lot of text classification software has appeared. For some tasks, automatic classifiers perform almost as well as humans, but for others, the gap is still large. These systems are directly related to machine learning, which aims to perform tasks normally achievable only by humans. There are generally two types of learning: learning “by heart,” which consists of storing information as is, and learning by generalization, where we learn from examples. In this chapter, the authors address the classification concept in detail and show how to solve different classification problems using different machine learning techniques.


Author(s):  
Padmavathi .S ◽  
M. Chidambaram

Text classification has become more significant in managing and organizing text data because of the tremendous growth of online information. It classifies documents into a fixed number of predefined categories. The rule-based approach and the machine learning approach are the two ways of performing text classification. In the rule-based approach, documents are classified according to manually defined rules. In the machine learning approach, the classification rules or classifier are defined automatically from example documents; this approach offers higher recall and faster processing. This paper presents an investigation of text classification using different machine learning techniques.
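The contrast between the two approaches can be sketched as follows. This is an illustrative example, not the method of the paper: the keyword table `RULES`, the function names, and the tiny training set are all hypothetical, and a multinomial Naive Bayes classifier with Laplace smoothing stands in for "a machine learning approach".

```python
import math
from collections import Counter, defaultdict

# Hypothetical hand-written rules: category -> trigger keywords.
RULES = {
    "sports": {"match", "team", "score"},
    "finance": {"stock", "market", "bank"},
}


def rule_based_classify(text, rules=RULES, default="unknown"):
    """Pick the category whose manually defined keywords match most often."""
    tokens = set(text.lower().split())
    hits = {label: len(tokens & keywords) for label, keywords in rules.items()}
    best = max(hits, key=hits.get)
    return best if hits[best] > 0 else default


def train_naive_bayes(examples):
    """Learn word statistics per category from (text, label) examples."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        class_counts[label] += 1
        for token in text.lower().split():
            word_counts[label][token] += 1
            vocab.add(token)
    return class_counts, word_counts, vocab


def nb_classify(text, model):
    """Return the category with the highest smoothed log-probability."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)
        for token in text.lower().split():
            # Laplace (add-one) smoothing handles unseen words.
            count = word_counts[label][token]
            score += math.log((count + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


if __name__ == "__main__":
    examples = [
        ("the team won the match", "sports"),
        ("final score of the game", "sports"),
        ("the bank raised rates", "finance"),
        ("stock market fell", "finance"),
    ]
    model = train_naive_bayes(examples)
    print(rule_based_classify("The team improved its score"))
    print(nb_classify("market and bank news", model))
```

In the rule-based function the decision logic lives in a table a person must write and maintain; in the learned classifier the same role is played by word counts derived automatically from labelled example documents.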


Author(s):  
Hyeuk Kim

Unsupervised learning in machine learning divides data into several groups. Observations in the same group have similar characteristics, and observations in different groups have different characteristics. In this paper, we classify data by partitioning around medoids, which has some advantages over k-means clustering. We apply it to baseball players in the Korea Baseball League. We also apply principal component analysis to the data and draw a graph using two components as the axes. Through this procedure, we interpret the meaning of the clustering graphically. The combination of partitioning around medoids and principal component analysis can be applied to any other data, and the approach makes it easy to identify the characteristics of each group.
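A minimal sketch of partitioning around medoids is given below. It is a simplified illustration rather than the paper's implementation: it assumes Euclidean distance, starts from the first k points instead of the usual greedy build phase, and accepts the first cost-reducing swap it finds. The function names `total_cost` and `pam` are hypothetical.

```python
import math


def total_cost(points, medoids):
    """Sum of distances from each point to its nearest medoid."""
    return sum(
        min(math.dist(p, points[m]) for m in medoids) for p in points
    )


def pam(points, k):
    """Cluster points around k medoids using swap-based local search."""
    medoids = list(range(k))  # naive initialisation (assumption)
    best = total_cost(points, medoids)
    improved = True
    while improved:
        improved = False
        for slot in range(k):
            for cand in range(len(points)):
                if cand in medoids:
                    continue
                # Try replacing one medoid with a non-medoid point.
                trial = medoids[:slot] + [cand] + medoids[slot + 1:]
                cost = total_cost(points, trial)
                if cost < best - 1e-12:
                    medoids, best, improved = trial, cost, True
    # Assign every point to the cluster of its nearest medoid.
    labels = [
        min(range(k), key=lambda j: math.dist(p, points[medoids[j]]))
        for p in points
    ]
    return medoids, labels


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    medoids, labels = pam(data, k=2)
    print("medoid indices:", medoids)
    print("cluster labels:", labels)
```

Unlike k-means centroids, the medoids are always actual observations, which makes the cluster centres directly interpretable (here, as real data points) and reduces sensitivity to outliers.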


Author(s):  
Aleksey Klokov ◽  
Evgenii Slobodyuk ◽  
Michael Charnine

The object of this research was a corpus of text data, collected together with the scientific advisor, and the natural language processing algorithms used to analyze it. A stream of hypotheses was tested against scientific publications in computer science through a series of simulation experiments described in this dissertation. The subject of the research is the algorithms, and their results, aimed at predicting promising topics and terms that appear over time in the scientific community. The result of this work is a set of machine learning models, which were used in experiments to identify promising terms and semantic relationships in the text corpus. The resulting models can be used for semantic processing and analysis in other subject areas.


2020 ◽  
Vol 15 (2) ◽  
pp. 68
Author(s):  
А. Н. Сухов

This article reveals the topicality not only of destructive conflicts but also of constructive and hybrid ones, which is essentially being done for the first time. It also describes the history of the formation of both foreign and domestic social conflictology. The chronology of the development of the latter is restored and presented objectively and in full, taking into account the contribution of the researchers who actually stood at its origins. The article deals with the essence of the socio-psychological approach to understanding conflicts. The subject of social conflictology includes the regularities of their occurrence and manifestation at various levels, in various spheres, and under various conditions, including normal, complicated, and extreme ones. Social conflictology includes the theory and practice of diagnosing, settling, and resolving social conflicts. It analyzes the difficulties that arise in defining the concept, structure, dynamics, and classification of social conflicts. It is therefore no accident that the most important task is to create a full-fledged theory of social conflicts; without it, effective settlement and resolution of social conflicts cannot be discussed. Social conflictology is an integral part of conflictology. Much work remains, both in theory and in application, before it is fully developed. At present, there is an urgent need to develop the conflict-related competence not only of professionals but also of various groups of the population.

