Klasifikasi Belimbing Menggunakan Naïve Bayes Berdasarkan Fitur Warna RGB (Star Fruit Classification Using Naïve Bayes Based on RGB Color Features)

Author(s):  
Fuzy Yustika Manik ◽  
Kana Saputra Saragih

Sorting is a key post-harvest problem for star fruit produced at large or industrial scale. Currently, star fruit is classified by visual inspection of the rind color by the human eye, a method that is neither effective nor efficient. This research aims to classify the sweetness level of star fruit using image processing techniques. The features extracted are the Red, Green and Blue (RGB) values, which characterize the color of the image. The extracted features are then used to classify the star fruit with the Naïve Bayes method. The data set comprises 120 star fruit images, of which 90 are training data and 30 are testing data. The results show a classification accuracy of 80% using RGB feature extraction. RGB color features cannot be relied on entirely as the image features for star fruit.
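The pipeline described above (mean RGB features fed to a Naïve Bayes classifier) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say which Naïve Bayes variant was used, so a Gaussian model is assumed here, and the class labels, base colors and `synthetic_image` helper are hypothetical stand-ins for real photographs.

```python
import math
import random

def mean_rgb(pixels):
    """Average the R, G and B channels over a list of (r, g, b) pixels."""
    n = len(pixels)
    return tuple(sum(p[k] for p in pixels) / n for k in range(3))

def fit_gaussian_nb(features, labels):
    """Estimate a prior plus per-channel mean and variance for each class."""
    model = {}
    for label in set(labels):
        rows = [f for f, l in zip(features, labels) if l == label]
        means = [sum(r[k] for r in rows) / len(rows) for k in range(3)]
        # A small constant keeps every variance strictly positive.
        variances = [
            sum((r[k] - means[k]) ** 2 for r in rows) / len(rows) + 1e-6
            for k in range(3)
        ]
        model[label] = (len(rows) / len(labels), means, variances)
    return model

def predict(model, feature):
    """Return the class with the highest log posterior."""
    def log_posterior(label):
        prior, means, variances = model[label]
        score = math.log(prior)
        for x, m, v in zip(feature, means, variances):
            score += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        return score
    return max(model, key=log_posterior)

def synthetic_image(base, rng, n_pixels=64, noise=10.0):
    """Hypothetical stand-in for a photo: pixels scattered around a base color."""
    return [tuple(rng.gauss(c, noise) for c in base) for _ in range(n_pixels)]

rng = random.Random(0)
# Assumed class colors, chosen only to make the example separable.
BASE = {"sour": (70, 160, 50), "sweet": (220, 190, 60)}
train = [(mean_rgb(synthetic_image(BASE[l], rng)), l) for l in BASE for _ in range(20)]
model = fit_gaussian_nb([f for f, _ in train], [l for _, l in train])
```

With real data, `mean_rgb` would be applied to the pixels of each star fruit photograph, and the 90/30 split from the abstract would replace the synthetic training set.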

2012 ◽  
Vol 490-495 ◽  
pp. 460-464 ◽  
Author(s):  
Xiao Dan Zhu ◽  
Jin Song Su ◽  
Qing Feng Wu ◽  
Huai Lin Dong

The Naive Bayes classification algorithm is a simple and effective classification algorithm. Most research on traditional Naive Bayes classification focuses on improving the algorithm and ignores the selection of training data, which has a great effect on classifier performance. This paper therefore proposes a method to optimize the selection of training data. With this method, noisy instances in the training data are eliminated using a user-defined effectiveness threshold, which improves classifier performance. Experimental results on large-scale data show that our approach significantly outperforms the baseline classifier.


Sensors ◽  
2020 ◽  
Vol 20 (9) ◽  
pp. 2451 ◽  
Author(s):  
Jin Wang ◽  
Yangning Tang ◽  
Shiming He ◽  
Changqing Zhao ◽  
Pradip Kumar Sharma ◽  
...  

Log anomaly detection is an efficient method to manage modern large-scale Internet of Things (IoT) systems. A growing number of works apply natural language processing (NLP) methods, in particular word2vec, to log feature extraction. Word2vec can extract the relevance between words and vectorize the words. However, the computing cost of training word2vec is high. Anomalies in logs depend not only on an individual log message but also on the log message sequence. Therefore, the word vectors from word2vec cannot be used directly; they need to be transformed into vectors of log events and further transformed into vectors of log sequences. To reduce computational cost and avoid multiple transformations, in this paper we propose an offline feature extraction model, named LogEvent2vec, which takes the log event as the input of word2vec to extract the relevance between log events and vectorize log events directly. LogEvent2vec can work with any coordinate transformation method and anomaly detection model. After obtaining the log event vectors, we transform them into log sequence vectors by bary or tf-idf, and three kinds of supervised models (Random Forests, Naive Bayes, and Neural Networks) are trained to detect the anomalies. We have conducted extensive experiments on a real public log dataset from BlueGene/L (BGL). The experimental results demonstrate that LogEvent2vec can significantly reduce computational time, by a factor of 30, and improve accuracy compared with word2vec. LogEvent2vec with bary and Random Forest achieves the best F1-score, and LogEvent2vec with tf-idf and Naive Bayes needs the least computational time.
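The step from log event vectors to a log sequence vector can be sketched as below. This is an illustration under assumptions, not the LogEvent2vec code: "bary" is read as the barycenter (plain average) of the event vectors, the tf-idf variant is taken to be a tf-idf weighted sum with idf = log(N / df), and the event names and two-dimensional vectors are invented for the example.

```python
import math
from collections import Counter

def bary(sequence, event_vectors):
    """Sequence vector as the plain average (barycenter) of its event vectors."""
    dim = len(event_vectors[sequence[0]])
    total = [0.0] * dim
    for event in sequence:
        for k, value in enumerate(event_vectors[event]):
            total[k] += value
    return [t / len(sequence) for t in total]

def tfidf(sequence, event_vectors, all_sequences):
    """Sequence vector as a tf-idf weighted sum of its event vectors.

    Assumes `sequence` is one of `all_sequences`, so every event has a
    document frequency of at least 1.
    """
    dim = len(event_vectors[sequence[0]])
    total = [0.0] * dim
    for event, count in Counter(sequence).items():
        tf = count / len(sequence)
        df = sum(1 for s in all_sequences if event in s)
        weight = tf * math.log(len(all_sequences) / df)
        for k, value in enumerate(event_vectors[event]):
            total[k] += weight * value
    return total

# Hypothetical event vectors, standing in for the output of word2vec.
event_vectors = {"E1": [1.0, 0.0], "E2": [0.0, 1.0], "E3": [1.0, 1.0]}
sequences = [["E1", "E2", "E1"], ["E1", "E3"]]

bary_vec = bary(sequences[0], event_vectors)
tfidf_vec = tfidf(sequences[0], event_vectors, sequences)
```

In this toy input, event E1 occurs in every sequence, so its idf is zero and it contributes nothing to the tf-idf vector, whereas the bary vector weights every occurrence equally. The resulting sequence vectors would then be passed to a supervised classifier.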


MATICS ◽  
2017 ◽  
Vol 9 (2) ◽  
pp. 53 ◽  
Author(s):  
Aris Diantoro ◽  
Irwan Budi Santoso

Losses in chicken egg hatcheries reduce breeders' income. The main cause is that distinguishing the fertility state of eggs is neither effective nor efficient. Automatic detection of fertile and infertile eggs makes it easier to select and separate the two. This brings breeders more profit, better time efficiency, and greater selling power. Infertile eggs still have a sale value for breeders if they are identified as early as possible, before they fail to hatch. A method based on fuzzy c-means and a naive Bayes classifier is designed to identify the fertility state of eggs. Eggs are placed near a light source against a black background in a dark room, and an image is taken with a high-quality camera. Features that distinguish fertile from infertile eggs are then extracted from the camera image. The study uses a total of 450 egg images collected in a field survey. The training data comprise 250 images: 125 of fertile eggs and 125 of infertile eggs. The testing data comprise 200 images: 150 of fertile eggs and 50 of infertile eggs. In trials on the training data, the best accuracy is 80% at intervals of 5, 86.4% at intervals of 5 with dimensions 70x60, and 99.6% with a 1x2 resize. The accuracies obtained in trials on the testing data are 78%, 82% and 94%.


Information ◽  
2021 ◽  
Vol 12 (5) ◽  
pp. 204
Author(s):  
Charlyn Villavicencio ◽  
Julio Jerison Macrohon ◽  
X. Alphonse Inbaraj ◽  
Jyh-Horng Jeng ◽  
Jer-Guang Hsieh

A year into the COVID-19 pandemic and one of the longest recorded lockdowns in the world, the Philippines received its first delivery of COVID-19 vaccines on 1 March 2021 through WHO’s COVAX initiative. A month into inoculation of all frontline health professionals and other priority groups, the authors of this study gathered data on the sentiment of Filipinos regarding the Philippine government’s efforts using the social networking site Twitter. Natural language processing techniques were applied to understand the general sentiment, which can help the government in analyzing its response. The tweets were annotated, and a Naïve Bayes model was trained to classify English and Filipino language tweets into positive, neutral, and negative polarities using the RapidMiner data science software. The results yielded an 81.77% accuracy, which exceeds the accuracy of recent sentiment analysis studies using Twitter data from the Philippines.


Entropy ◽  
2019 ◽  
Vol 21 (8) ◽  
pp. 721 ◽  
Author(s):  
YuGuang Long ◽  
LiMin Wang ◽  
MingHui Sun

Due to the simplicity and competitive classification performance of the naive Bayes (NB), researchers have proposed many approaches to improve NB by weakening its attribute independence assumption. Through the theoretical analysis of Kullback–Leibler divergence, the difference between NB and its variations lies in different orders of conditional mutual information represented by these augmenting edges in the tree-shaped network structure. In this paper, we propose to relax the independence assumption by further generalizing tree-augmented naive Bayes (TAN) from 1-dependence Bayesian network classifiers (BNC) to arbitrary k-dependence. Sub-models of TAN that are built to respectively represent specific conditional dependence relationships may “best match” the conditional probability distribution over the training data. Extensive experimental results reveal that the proposed algorithm achieves bias-variance trade-off and substantially better generalization performance than state-of-the-art classifiers such as logistic regression.
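The conditional mutual information mentioned above, I(Xi; Xj | C), is the quantity that tree-augmented naive Bayes conventionally uses to weight candidate edges between attributes. The sketch below estimates it from empirical counts over discrete columns. It illustrates the general quantity only, and the function name and toy columns are assumptions rather than anything taken from the paper.

```python
import math
from collections import Counter

def conditional_mutual_information(xi, xj, c):
    """Estimate I(Xi; Xj | C) in nats from three equal-length discrete columns."""
    n = len(c)
    n_ijc = Counter(zip(xi, xj, c))
    n_ic = Counter(zip(xi, c))
    n_jc = Counter(zip(xj, c))
    n_c = Counter(c)
    total = 0.0
    for (a, b, k), count in n_ijc.items():
        # p(a, b, k) * log[ p(a, b | k) / (p(a | k) * p(b | k)) ],
        # with every probability replaced by its empirical frequency.
        ratio = count * n_c[k] / (n_ic[(a, k)] * n_jc[(b, k)])
        total += (count / n) * math.log(ratio)
    return total

# Toy columns with a single class value.
cls = [0, 0, 0, 0]
dependent = conditional_mutual_information([0, 0, 1, 1], [0, 0, 1, 1], cls)
independent = conditional_mutual_information([0, 0, 1, 1], [0, 1, 0, 1], cls)
```

Two identical binary attributes give log 2 nats, the entropy of the attribute, while two attributes that are independent within the class give zero. A larger value suggests an augmenting edge worth adding to the network.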


2017 ◽  
Vol 9 (4) ◽  
pp. 416 ◽  
Author(s):  
Nelly Indriani Widiastuti ◽  
Ednawati Rainarli ◽  
Kania Evita Dewi

Classification is the process of grouping objects that share the same features or characteristics into classes. Automatic document classification uses the frequency of words appearing in the training data as features. As the number of documents grows, so does the number of words that appear as features. Summaries are therefore used to reduce the number of words used in classification. The classification uses the multiclass Support Vector Machine (SVM) method, which has a good reputation for classification. This research tests the effect of using summaries as feature selection in document classification. The summaries reduce the text to 50%. The results show that the summaries did not affect the accuracy of document classification with SVM, but they improved the accuracy of the Simple Logistic classifier. The classification tests show that Naïve Bayes Multinomial (NBM) is more accurate than SVM.


2020 ◽  
Vol 17 (1) ◽  
pp. 37-42
Author(s):  
Yuris Alkhalifi ◽  
Ainun Zumarniansyah ◽  
Rian Ardianto ◽  
Nila Hardi ◽  
Annisa Elfina Augustia

Non-Cash Food Assistance, or Bantuan Pangan Non-Tunai (BPNT), is government food assistance given to Beneficiary Families (KPM) every month through an electronic account mechanism that can be used only to buy food at the Electronic Shop Mutual Assistance Joint Business Group Hope Family Program (e-Warong KUBE PKH) or at food traders working with Bank Himbara. In its distribution, BPNT still poses problems for village officials, especially those of Desa Wanasari, who must decide which households are eligible to receive it (poor) and which are not (not poor). Data mining can support these decisions. This study compares two algorithms, the Naive Bayes Classifier and the C4.5 Decision Tree. The sample consists of data on 200 heads of household, split for validation into 90% training data and 10% test data. The proposed model is built in the RapidMiner application and evaluated with a confusion matrix to find which of the two methods is more accurate. The results show an accuracy of 98.89% for the Naive Bayes Classifier and 95.00% for the C4.5 Decision Tree. The study concludes that the Naive Bayes Classifier is the more accurate algorithm, by a margin of 3.89 percentage points.
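The evaluation step described above, reading accuracy off a confusion matrix, can be sketched as follows. The study performed this step in RapidMiner, so this is only an illustration of the calculation. The label lists are invented for the example and are not the study's data.

```python
from collections import Counter

def confusion_matrix(actual, predicted):
    """Count each (actual label, predicted label) pair."""
    return Counter(zip(actual, predicted))

def accuracy(matrix):
    """Share of cases on the diagonal, where prediction equals the actual label."""
    correct = sum(n for (a, p), n in matrix.items() if a == p)
    return correct / sum(matrix.values())

# Hypothetical test labels, for illustration only.
actual = ["poor", "poor", "not poor", "not poor", "poor"]
predicted = ["poor", "not poor", "not poor", "not poor", "poor"]

matrix = confusion_matrix(actual, predicted)
acc = accuracy(matrix)
```

Here four of the five hypothetical cases are classified correctly, and the single error appears in the cell for a poor household predicted as not poor. The same calculation applied to each model's test predictions gives the accuracies being compared.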


Repositor ◽  
2020 ◽  
Vol 2 (5) ◽  
pp. 675
Author(s):  
Muhammad Athaillah ◽  
Yufiz Azhar ◽  
Yuda Munarko

Hoax news classification is an application of text categorization. Hoax news must be classified because it can influence readers' actions and patterns of thinking. The classification process in this research has several stages: preprocessing, feature extraction, feature selection and classification. The research compares two algorithms, Naïve Bayes and Multinomial Naïve Bayes, to determine which is more effective at classifying hoax news. The data comprise 100 hoax news articles from turnbackhoax.id and 100 non-hoax news articles from kompas.com and detik.com. The training data comprise 140 articles and the test data 60 articles. In the comparison, Naïve Bayes has an F1-score of 0.93 and Multinomial Naïve Bayes has an F1-score of 0.92.
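The classification and scoring stages above can be sketched with a small Multinomial Naïve Bayes classifier and an F1-score function. This is a minimal illustration that assumes add-one (Laplace) smoothing and pre-tokenized text; the abstract does not give the authors' preprocessing or feature selection, and the toy documents below are invented.

```python
import math
from collections import Counter, defaultdict

def train_multinomial_nb(docs, labels):
    """Count words per class; each document is a list of tokens."""
    word_counts = defaultdict(Counter)
    class_counts = Counter(labels)
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, class_counts, vocab

def classify(model, tokens):
    """Pick the class with the highest log posterior under add-one smoothing."""
    word_counts, class_counts, vocab = model
    n_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        total = sum(word_counts[label].values())
        score = math.log(class_counts[label] / n_docs)
        for w in tokens:
            if w in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def f1_score(actual, predicted, positive):
    """Harmonic mean of precision and recall for the `positive` class."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Invented toy documents, for illustration only.
docs = [["shocking", "viral", "share"], ["viral", "secret", "share"],
        ["minister", "announces", "budget"], ["court", "announces", "verdict"]]
labels = ["hoax", "hoax", "news", "news"]
model = train_multinomial_nb(docs, labels)
label = classify(model, ["viral", "share", "budget"])
f1 = f1_score(["hoax", "hoax", "news", "news"], ["hoax", "news", "news", "news"], "hoax")
```

Running `classify` over the 60 test articles and passing the results to `f1_score` would give the kind of figure reported for each algorithm.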

