Housing Value Forecasting Based on Machine Learning Methods

Abstract and Applied Analysis ◽

10.1155/2014/648047 ◽

2014 ◽

Vol 2014 ◽

pp. 1-7 ◽

Cited By ~ 8

Author(s):

Jingyi Mu ◽

Fang Wu ◽

Aihua Zhang

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Big Data ◽

Least Squares ◽

Optimal Solution ◽

Support Vector ◽

Learning Methods ◽

Data Set ◽

Machine Learning Methods ◽

The Government

In the era of big data, many urgent problems across all walks of life can be addressed with big data techniques. Compared with the Internet, economy, industry, and aerospace fields, applications of big data in architecture are relatively few. In this paper, the values of Boston suburb houses are forecast from actual data by several machine learning methods. Based on the predictions, the government and developers can decide whether or not to develop real estate in the corresponding regions. Support vector machine (SVM), least squares support vector machine (LSSVM), and partial least squares (PLS) methods are used to forecast the home values, and the algorithms are compared according to the predicted results. Experiments show that, although the data set exhibits serious nonlinearity, SVM and LSSVM are superior to PLS in dealing with it. Because SVM solves a quadratic programming problem, it can find the global optimal solution and achieves the best forecasting performance. The computational efficiency of the algorithms is also compared according to their computing times.
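As a hedged illustration of why solving a quadratic program yields a global optimum, the standard ε-insensitive support vector regression primal can be written as below. This is the textbook formulation, not necessarily the exact variant used in the paper; the symbols w, b, ξ, ξ*, C, ε and the feature map φ are assumptions introduced here.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi,\,\xi^{*}} \quad & \tfrac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{n} \left( \xi_i + \xi_i^{*} \right) \\
\text{s.t.} \quad & y_i - w^{\top}\phi(x_i) - b \le \varepsilon + \xi_i, \\
& w^{\top}\phi(x_i) + b - y_i \le \varepsilon + \xi_i^{*}, \\
& \xi_i \ge 0, \quad \xi_i^{*} \ge 0, \qquad i = 1, \dots, n.
\end{aligned}
```

The objective is convex and the constraints are linear, so any local minimum is also global. LSSVM, in its usual form, replaces the inequality constraints with equalities and a squared-error penalty, which reduces training to solving a linear system.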


Prediction of Collapsibility of Loess of Construction Sites in Xining Based on Machine Learning Methods

10.21203/rs.3.rs-307514/v1 ◽

2021 ◽

Author(s):

Qifei Zhao ◽

Xiaojun Li ◽

Yunning Cao ◽

Zhikun Li ◽

Jixin Fan

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Training Data ◽

Support Vector ◽

Engineering Practice ◽

Burial Depth ◽

Learning Methods ◽

Data Set ◽

Machine Learning Methods ◽

North East

Collapsibility of loess is a significant factor affecting engineering construction in loess areas, and testing the collapsibility of loess is costly. In this study, a total of 4,256 loess samples are collected from the north, east, west and middle regions of Xining. 70% of the samples are used to generate the training data set, and the rest are used to generate the verification data set, so as to construct and validate the machine learning models. The six most important factors are selected from thirteen factors by using Grey Relational analysis and multicollinearity analysis: burial depth, water content, specific gravity of soil particles, void rate, geostatic stress and plasticity limit. In order to predict the collapsibility of loess, four machine learning methods are studied and compared: Support Vector Machine (SVM), Random Subspace Based Support Vector Machine (RSSVM), Random Forest (RF) and Naïve Bayes Tree (NBTree). The receiver operating characteristic (ROC) curve indicators, standard error (SD) and 95% confidence interval (CI) are used to verify and compare the models in different research areas. The results show that the RF model is the most efficient in predicting the collapsibility of loess in Xining, and its average AUC is above 80%, so it can be used in engineering practice.
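The 70/30 split and the AUC indicator mentioned in this abstract can be sketched in plain Python. This is only an illustration under assumptions: the function names, the fixed shuffle seed and the binary 0/1 labels are hypothetical, and the study's actual tooling is not described in the abstract.

```python
import random


def train_validation_split(samples, train_fraction=0.7, seed=0):
    """Shuffle a copy of the samples and split it into training and validation parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seed is an assumption, for repeatability
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

With 4,256 samples, this split yields 2,979 training and 1,277 validation samples. The pairwise AUC is quadratic in the sample count, which is acceptable for a sketch but slow for large validation sets.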


Identifying Cancer Targets Based on Machine Learning Methods via Chou’s 5-steps Rule and General Pseudo Components

Current Topics in Medicinal Chemistry ◽

10.2174/1568026619666191016155543 ◽

2019 ◽

Vol 19 (25) ◽

pp. 2301-2317 ◽

Cited By ~ 2

Author(s):

Ruirui Liang ◽

Jiayang Xie ◽

Chi Zhang ◽

Mengying Zhang ◽

Hai Huang ◽

...

Keyword(s):

Machine Learning ◽

Growth Rate ◽

Big Data ◽

Human Genome Project ◽

Genome Project ◽

Support Vector ◽

Successful Implementation ◽

Learning Methods ◽

Machine Learning Methods ◽

Vector Machines

In recent years, the successful implementation of the Human Genome Project has made people realize that genetic, environmental and lifestyle factors should be combined to study cancer, owing to the complexity and various forms of the disease. The increasing availability and growth rate of ‘big data’ derived from various omics open a new window for the study and therapy of cancer. In this paper, we introduce the application of machine learning methods in handling cancer big data, including the use of artificial neural networks, support vector machines, ensemble learning and naïve Bayes classifiers.


Assessing Replicability of Machine Learning Results: An Introduction to Methods on Predictive Accuracy in Social Sciences

Social Science Computer Review ◽

10.1177/0894439319888445 ◽

2019 ◽

pp. 089443931988844

Author(s):

Ranjith Vijayakumar ◽

Mike W.-L. Cheung

Keyword(s):

Machine Learning ◽

Empirical Data ◽

Fixed Effects ◽

Predictive Accuracy ◽

Support Vector ◽

Learning Methods ◽

Data Set ◽

Replication Studies ◽

Machine Learning Methods ◽

Accuracy Measure

Machine learning methods have become very popular in diverse fields due to their focus on predictive accuracy, but little work has been conducted on how to assess the replicability of their findings. We introduce and adapt replication methods advocated in psychology to the aims and procedural needs of machine learning research. In Study 1, we illustrate these methods with the use of an empirical data set, assessing the replication success of a predictive accuracy measure, namely, R² on the cross-validated and test sets of the samples. We introduce three replication aims. First, tests of inconsistency examine whether single replications have successfully rejected the original study. Rejection will be supported if the 95% confidence interval (CI) of R² difference estimates between replication and original does not contain zero. Second, tests of consistency help support claims of successful replication. We can decide a priori on a region of equivalence, where population values of the difference estimates are considered equivalent for substantive reasons. The 90% CI of a difference estimate lying fully within this region supports replication. Third, we show how to combine replications to construct meta-analytic intervals for better precision of predictive accuracy measures. In Study 2, R² is reduced from the original in a subset of replication studies to examine the ability of the replication procedures to distinguish true replications from nonreplications. We find that when combining studies sampled from the same population to form meta-analytic intervals, random-effects methods perform best for cross-validated measures while fixed-effects methods work best for test measures. Among machine learning methods, regression was comparable to many complex methods, while support vector machine performed most reliably across a variety of scenarios. Social scientists who use machine learning to model empirical data can use these methods to enhance the reliability of their findings.
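The three replication aims can be sketched as simple interval checks. This is a minimal illustration, not the authors' procedure: the function names are hypothetical, the confidence intervals are assumed to be computed elsewhere, and the pooling step uses the textbook inverse-variance fixed-effects formula with a normal approximation.

```python
import math


def rejects_original(ci_low, ci_high):
    """Test of inconsistency: the 95% CI of the R^2 difference excludes zero."""
    return not (ci_low <= 0.0 <= ci_high)


def supports_replication(ci_low, ci_high, eq_low, eq_high):
    """Test of consistency: the 90% CI lies fully inside the equivalence region."""
    return eq_low <= ci_low and ci_high <= eq_high


def fixed_effects_interval(estimates, std_errors, z=1.96):
    """Inverse-variance weighted pooled estimate with a normal-approximation CI."""
    weights = [1.0 / (se * se) for se in std_errors]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled, pooled - z * pooled_se, pooled + z * pooled_se
```

For example, a difference interval of (0.01, 0.08) excludes zero and so counts as rejecting the original, while (-0.03, 0.02) inside an equivalence region of (-0.05, 0.05) supports replication. The region bounds here are made-up values for illustration.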


Fast-forward solver for inhomogeneous media using machine learning methods: artificial neural network, support vector machine and fuzzy logic

Neural Computing and Applications ◽

10.1007/s00521-016-2694-9 ◽

2016 ◽

Vol 29 (12) ◽

pp. 1583-1591 ◽

Cited By ~ 4

Author(s):

Mohammad Abdolrazzaghi ◽

Soheil Hashemy ◽

Ali Abdolali

Keyword(s):

Neural Network ◽

Machine Learning ◽

Artificial Neural Network ◽

Support Vector Machine ◽

Fuzzy Logic ◽

Inhomogeneous Media ◽

Support Vector ◽

Learning Methods ◽

Network Support ◽

Machine Learning Methods


Retrieval of aerosol optical depth from surface solar radiation measurements using machine learning algorithms, non-linear regression and a radiative transfer-based look-up table

Atmospheric Chemistry and Physics ◽

10.5194/acp-16-8181-2016 ◽

2016 ◽

Vol 16 (13) ◽

pp. 8181-8191 ◽

Cited By ~ 10

Author(s):

Jani Huttunen ◽

Harri Kokkola ◽

Tero Mielonen ◽

Mika Esa Juhani Mononen ◽

Antti Lipponen ◽

...

Keyword(s):

Neural Network ◽

Machine Learning ◽

Support Vector Machine ◽

Linear Regression ◽

Support Vector ◽

Learning Methods ◽

Surface Solar Radiation ◽

Machine Learning Methods ◽

Look Up Table ◽

Non Linear

In order to have a good estimate of the current forcing by anthropogenic aerosols, knowledge of past aerosol levels is needed. Aerosol optical depth (AOD) is a good measure for aerosol loading. However, dedicated measurements of AOD are only available from the 1990s onward. One option to lengthen the AOD time series beyond the 1990s is to retrieve AOD from surface solar radiation (SSR) measurements taken with pyranometers. In this work, we have evaluated several inversion methods designed for this task. We compared a look-up table (LUT) method based on radiative transfer modelling, a non-linear regression method and four machine learning methods (Gaussian process, neural network, random forest and support vector machine) with AOD observations carried out with a sun photometer at an Aerosol Robotic Network (AERONET) site in Thessaloniki, Greece. Our results show that most of the machine learning methods produce AOD estimates comparable to the look-up table and non-linear regression methods. All of the applied methods produced AOD values that corresponded well to the AERONET observations, with the lowest correlation coefficient value being 0.87 for the random forest method. While many of the methods tended to slightly overestimate low AODs and underestimate high AODs, neural network and support vector machine showed overall better correspondence for the whole AOD range. The differences in producing both ends of the AOD range seem to be caused by differences in the aerosol composition. High AODs were in most cases those with high water vapour content, which might affect the aerosol single scattering albedo (SSA) through uptake of water into aerosols. Our study indicates that machine learning methods benefit from the fact that they do not constrain the aerosol SSA in the retrieval, whereas the LUT method assumes a constant value for it. This would also mean that machine learning methods could have potential in reproducing AOD from SSR even though SSA would have changed during the observation period.
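The correlation coefficient used to compare retrieved and observed AOD can be computed as below. This sketch assumes the Pearson coefficient, which the abstract does not state explicitly; the function name and inputs are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equally long sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of centered products and squares
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)
```

A high correlation alone does not reveal the range-dependent bias the abstract describes (overestimating low AODs, underestimating high AODs), so a per-range mean error would be a natural companion check.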


Retrieval of aerosol optical depth from surface solar radiation measurements using machine learning algorithms, nonlinear regression and a radiative transfer based look-up table

10.5194/acp-2016-58 ◽

2016 ◽

Author(s):

J. Huttunen ◽

H. Kokkola ◽

T. Mielonen ◽

M. Mononen ◽

A. Lipponen ◽

...

Keyword(s):

Neural Network ◽

Machine Learning ◽

Support Vector Machine ◽

Random Forest ◽

Nonlinear Regression ◽

Support Vector ◽

Learning Methods ◽

Surface Solar Radiation ◽

Machine Learning Methods ◽

Look Up Table

In order to have a good estimate of the current forcing by anthropogenic aerosols, knowledge of past aerosol levels is needed. Aerosol optical depth (AOD) is a good measure for aerosol loading. However, dedicated measurements of AOD are only available from the 1990s onward. One option to lengthen the AOD time series beyond the 1990s is to retrieve AOD from surface solar radiation (SSR) measurements taken with pyranometers. In this work, we have evaluated several inversion methods designed for this task. We compared a look-up table (LUT) method based on radiative transfer modelling, a nonlinear regression method and four machine learning methods (Gaussian Process, Neural Network, Random Forest and Support Vector Machine) with AOD observations made with a sun photometer at an Aerosol Robotic Network (AERONET) site in Thessaloniki, Greece. Our results show that most of the machine learning methods produce AOD estimates comparable to the look-up table and nonlinear regression methods. All of the applied methods produced AOD values that corresponded well to the AERONET observations, with the lowest correlation coefficient value being 0.87 for the Random Forest method. While many of the methods tended to slightly overestimate low AODs and underestimate high AODs, neural network and support vector machine showed overall better correspondence for the whole AOD range. The differences in producing both ends of the AOD range seem to be caused by differences in the aerosol composition. High AODs were in most cases those with high water vapour content, which might affect the aerosol single scattering albedo (SSA) through uptake of water into aerosols. Our study indicates that machine learning methods benefit from the fact that they do not constrain the aerosol SSA in the retrieval, whereas the LUT method assumes a constant value for it. This would also mean that machine learning methods could have potential in reproducing AOD from SSR even though SSA would have changed during the observation period.


Breast Cancer Prediction Using Machine Learning Algorithm with Big Data Concept

International Journal of Scientific Research in Science Engineering and Technology ◽

10.32628/ijsrset1207232 ◽

2020 ◽

pp. 123-127

Author(s):

R. Nirmalan ◽

M. Javith Hussain Khan ◽

V. Sounder ◽

A. Manikkaraja

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Support Vector Machine ◽

Big Data ◽

Learning Algorithm ◽

Support Vector ◽

Data Set ◽

Cancer Prediction ◽

Modern Computer ◽

Huge Data

The evolution of modern computer technology, with its many inventions, produces a huge amount of data. The algorithms traditionally used in machine learning might not support the concept of big data. Here we discuss and implement a solution to this problem for predicting breast cancer using big data. DNA methylation (DM) and gene expression (GE) are the two types of data used for the prediction of breast cancer. The main objective is to classify each data set separately. To achieve this objective, we use the Apache Spark platform. We apply three classification algorithms: decision tree, random forest, and support vector machine (SVM). These three algorithms are used to produce models for breast cancer prediction. An analysis is carried out to find which algorithm produces the better result, with good accuracy and a low error rate. Additionally, the Weka and Spark platforms are compared to find which performs better when dealing with huge data. The results show that the scalable support vector machine classifier performs better than all the other classifiers, achieving the lowest error rate and the highest accuracy on the GE data set.


Comparative Study on Theoretical and Machine Learning Methods for Acquiring Compressed Liquid Densities of 1,1,1,2,3,3,3-Heptafluoropropane (R227ea) via Song and Mason Equation, Support Vector Machine, and Artificial Neural Networks

Applied Sciences ◽

10.3390/app6010025 ◽

2016 ◽

Vol 6 (1) ◽

pp. 25 ◽

Cited By ~ 18

Author(s):

Hao Li ◽

Xindong Tang ◽

Run Wang ◽

Fan Lin ◽

Zhijian Liu ◽

...

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Support Vector Machine ◽

Artificial Neural Networks ◽

Comparative Study ◽

Support Vector ◽

Learning Methods ◽

Machine Learning Methods ◽

Compressed Liquid ◽

Artificial Neural


Comparison of machine learning methods for stationary wavelet entropy-based multiple sclerosis detection: decision tree, k-nearest neighbors, and support vector machine

SIMULATION ◽

10.1177/0037549716666962 ◽

2016 ◽

Vol 92 (9) ◽

pp. 861-871 ◽

Cited By ~ 54

Author(s):

Yudong Zhang ◽

Siyuan Lu ◽

Xingxing Zhou ◽

Ming Yang ◽

Lenan Wu ◽

...

Keyword(s):

Machine Learning ◽

Multiple Sclerosis ◽

Support Vector Machine ◽

Decision Tree ◽

Nearest Neighbors ◽

Support Vector ◽

Learning Methods ◽

K Nearest Neighbors ◽

Wavelet Entropy ◽

Machine Learning Methods


Design of an Intelligent Variable-Flow Recirculating Aquaculture System Based on Machine Learning Methods

Applied Sciences ◽

10.3390/app11146546 ◽

2021 ◽

Vol 11 (14) ◽

pp. 6546

Author(s):

Fudi Chen ◽

Yishuai Du ◽

Tianlong Qiu ◽

Zhe Xu ◽

Li Zhou ◽

...

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Support Vector ◽

Recirculating Aquaculture System ◽

Learning Methods ◽

Variable Flow ◽

Machine Learning Methods ◽

Recirculating Aquaculture ◽

Aquaculture System ◽

Regulation Model

A recirculating aquaculture system (RAS) can reduce water and land requirements for intensive aquaculture production. However, a traditional RAS uses a fixed circulation flow rate for water treatment. In general, the water in an RAS is highly turbid only when the animals are fed and when they excrete. Therefore, RAS water quality regulation technology based on process control is proposed in this paper. The intelligent variable-flow RAS was designed based on the circulating pump-drum filter linkage working model. Machine learning methods were introduced to develop the intelligent regulation model to maintain a clean and stable water environment. Results showed that the long short-term memory network performed with the highest accuracy (training set 100%, test set 96.84%) and F1-score (training 100%, test 93.83%) among artificial neural networks. Optimization methods including grid search, cuckoo search, linear squares, and genetic algorithm were proposed to improve the classification ability of support vector machine models. Results showed that all support vector machine models passed cross-validation and could meet accuracy standards. In summary, the genetic algorithm support vector machine model (accuracy: training 100%, test 98.95%; F1-score: training 100%, test 99.17%) is suitable as an optimal variable-flow regulation model for an intelligent variable-flow RAS.
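The accuracy and F1-score figures quoted in this abstract follow the usual definitions, sketched below for the binary case. The binary setting, the function name and the `positive` parameter are assumptions; the abstract does not say how many classes the regulation model distinguishes or how F1 was averaged.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for binary labels, treating `positive` as the target class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("need two non-empty label sequences of equal length")
    tp = fp = fn = correct = 0
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            correct += 1
        if guess == positive and truth == positive:
            tp += 1
        elif guess == positive:
            fp += 1
        elif truth == positive:
            fn += 1
    accuracy = correct / len(y_true)
    # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when there are no positives at all
    denominator = 2 * tp + fp + fn
    f1 = (2 * tp / denominator) if denominator else 0.0
    return accuracy, f1
```

Reporting both metrics matters when classes are imbalanced, since accuracy can stay high even if the minority class is rarely predicted correctly.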
