Effects of network topology on the performance of consensus and distributed learning of SVMs using ADMM

PeerJ Computer Science ◽

10.7717/peerj-cs.397 ◽

2021 ◽

Vol 7 ◽

pp. e397

Author(s):

Shirin Tavara ◽

Alexander Schliep

Keyword(s):

Network Topology ◽

Large Scale ◽

Learning Problems ◽

Support Vector ◽

Spectral Gaps ◽

Distributed Framework ◽

Alternating Direction ◽

Communication Time ◽

Network Topologies ◽

The Impact

The Alternating Direction Method of Multipliers (ADMM) is a popular and promising distributed framework for solving large-scale machine learning problems. We consider decentralized consensus-based ADMM in which nodes may only communicate with one-hop neighbors. This may cause slow convergence. We investigate the impact of network topology on the performance of ADMM-based learning of Support Vector Machines (SVMs) using expander graphs, mean-degree graphs, and some common modern network topologies. In particular, we investigate to what degree the expansion property of the network influences convergence in terms of iterations, training time, and communication time. We furthermore suggest which topology is preferable. Additionally, we provide an implementation that makes these theoretical advances easily available. The results show that the convergence of decentralized ADMM-based learning of SVMs improves on graphs with large spectral gaps and with higher, homogeneous degrees.
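To make the role of topology concrete, the following minimal sketch compares how many rounds of one-hop neighbor averaging two graphs need to agree on a common value. It is not the authors' ADMM implementation: it uses plain Metropolis-weighted averaging as a stand-in for consensus, and the helper names (`ring`, `complete`, `consensus_iterations`), the tolerance, and the node count are all illustrative assumptions.

```python
def ring(n):
    """Adjacency lists of a cycle: every node has exactly two neighbors."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


def complete(n):
    """Adjacency lists of a complete graph: every node sees all others."""
    return {i: [j for j in range(n) if j != i] for i in range(n)}


def consensus_iterations(adj, values, tol=1e-6, max_iter=100_000):
    """Count averaging rounds until every node is within tol of the mean.

    Each round, node i moves toward each neighbor j with the Metropolis
    weight 1 / (1 + max(deg_i, deg_j)), which keeps the global mean fixed.
    This is simple averaging consensus, not ADMM (assumption for brevity).
    """
    x = list(values)
    mean = sum(x) / len(x)
    deg = {i: len(neighbors) for i, neighbors in adj.items()}
    for k in range(max_iter):
        if max(abs(v - mean) for v in x) < tol:
            return k
        x = [
            x[i] + sum((x[j] - x[i]) / (1 + max(deg[i], deg[j])) for j in adj[i])
            for i in range(len(x))
        ]
    return max_iter


n = 16
start = [float(i) for i in range(n)]
ring_iters = consensus_iterations(ring(n), start)
complete_iters = consensus_iterations(complete(n), start)
print("ring:", ring_iters, "complete:", complete_iters)
```

The sparse ring, whose spectral gap is small, needs many more rounds than the densely connected graph, which is the qualitative effect the abstract describes.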


MODELING THE PERFORMANCE OF COMMUNICATION SCHEMES ON NETWORK TOPOLOGIES

Parallel Processing Letters ◽

10.1142/s012962640800334x ◽

2008 ◽

Vol 18 (02) ◽

pp. 205-220

Author(s):

JAN LEMEIRE ◽

ERIK DIRKX ◽

WALTER COLITTI

Keyword(s):

Network Topology ◽

Structure Learning ◽

Interconnection Network ◽

Causal Structure ◽

Quantitative Prediction ◽

Communication Performance ◽

Communication Time ◽

Communication Scheme ◽

Network Topologies ◽

The Impact

This paper investigates the influence of the interconnection network topology of a parallel system on the delivery time of an ensemble of messages, called the communication scheme. More specifically, we focus on how structure in the network topology and in the communication scheme affects performance. We introduce causal structure learning algorithms for modeling the communication time. The experimental data, from which the models are learned automatically, is retrieved from simulations. The qualitative models provide insight into which variables influence the communication performance and how. Next, a generic property is defined which characterizes the performance of individual communication schemes and network topologies. The property allows the accurate quantitative prediction of the runtime of random communication on random topologies. However, when either the communication scheme or the network topology exhibits regularities, the prediction can become very inaccurate. The causal models can also differ qualitatively and quantitatively. Each combination of communication scheme regularity type, e.g. a one-to-all broadcast, and network topology regularity type, e.g. a torus, possibly results in a different model which is based on different characteristics.


Solving large-scale multiclass learning problems via an efficient support vector classifier

Journal of Systems Engineering and Electronics ◽

10.1016/s1004-4132(07)60036-x ◽

2006 ◽

Vol 17 (4) ◽

pp. 910-915 ◽

Cited By ~ 3

Author(s):

Zheng Shuibo ◽

Tang Houjun ◽

Han Zhengzhi ◽

Zhang Haoran

Keyword(s):

Large Scale ◽

Learning Problems ◽

Support Vector ◽

Support Vector Classifier


Detection of sand deposition in pipeline using percussion, voice recognition, and support vector machine

Structural Health Monitoring ◽

10.1177/1475921720918890 ◽

2020 ◽

Vol 19 (6) ◽

pp. 2075-2090 ◽

Cited By ~ 2

Author(s):

Hao Cheng ◽

Furui Wang ◽

Linsheng Huo ◽

Gangbing Song

Keyword(s):

Support Vector Machine ◽

Large Scale ◽

Rapid Development ◽

Voice Recognition ◽

Support Vector ◽

Machine Model ◽

Frequency Scale ◽

Removal Time ◽

The Impact ◽

Non Destructive

Deposit prevention and removal in pipelines is of great importance to ensure pipeline operation. Selecting a suitable removal time based on the composition and mass of the deposits not only reduces cost but also improves efficiency. In this article, we develop a new non-destructive approach using the percussion method and voice recognition with a support vector machine to detect sandy deposits in a steel pipeline. In particular, as the mass of sandy deposits in the pipeline changes, the impact-induced sound signals differ. A commonly used voice recognition feature, Mel-Frequency Cepstrum Coefficients, which represent the result of a cosine transform of the real logarithm of the short-term energy spectrum on a Mel-frequency scale, is adopted in this research, and Mel-Frequency Cepstrum Coefficients are extracted from the obtained sound signals. A support vector machine model was employed to identify sandy deposits with different mass values by classifying energy summation and Mel-Frequency Cepstrum Coefficients. In addition, the classification accuracies of energy summation and Mel-Frequency Cepstrum Coefficients are compared. The experimental results demonstrated that Mel-Frequency Cepstrum Coefficients perform better in pipeline deposit detection and have great potential in acoustic recognition for structural health monitoring. In addition, the proposed Mel-Frequency Cepstrum Coefficients–based pipeline deposit monitoring model can estimate the deposits in the pipeline with high accuracy. Moreover, compared with current non-destructive deposit detection approaches, the percussion method is easy to implement. With the rapid development of artificial intelligence and acoustic recognition, the proposed method can realize higher accuracy and higher speed in the detection of pipeline deposits, and has great application potential in the future. In addition, the proposed percussion method can enable robotic-based inspection for large-scale implementation.


Disentangling network topology and pathogen spread

10.1101/2021.05.24.21257706 ◽

2021 ◽

Author(s):

Maria Perez-Ortiz ◽

Petru Manescu ◽

Fabio Caccioli ◽

Delmiro Fernandez-Reyes ◽

Parashkev Nachev ◽

...

Keyword(s):

Network Topology ◽

Large Scale ◽

Social Contact ◽

Small World ◽

Pathogen Transmission ◽

Stochastic Simulations ◽

Equal Weight ◽

Topological Features ◽

Network Metrics ◽

The Impact

How do we best constrain social interactions to prevent the transmission of communicable respiratory diseases? Indiscriminate suppression, the currently accepted answer, is both unsustainable long term and implausibly presupposes all interactions to carry equal weight. Transmission within a social network is determined by the topology of its graphical structure, of which the number of interactions is only one aspect. Here we deploy large-scale numerical simulations to quantify the impact on pathogen transmission of a set of topological features covering the parameter space of realistic possibility. We first test, through a series of stochastic simulations, the differences in the spread of disease on several classes of network geometry (including highly skewed and small-world networks). We then aim to characterise the spread based on the characteristics of the network topology using regression analysis, highlighting some of the network metrics that influence the spread the most. For this, we build a dataset composed of more than 9000 social networks and 30 topological network metrics. We find that pathogen spread is optimally reduced by limiting specific kinds of social contact (unfamiliar and long range) rather than their global number. Our results compel a revaluation of social interventions in communicable diseases, and the optimal approach to crafting them.


Ecological Environment Changes of Mining Areas Around Nansi Lake With Remote Sensing Monitoring

10.21203/rs.3.rs-186720/v1 ◽

2021 ◽

Author(s):

Hu Liu ◽

Yan Jiang ◽

Rafal Misa ◽

Junhai Gao ◽

Mingyu Xia ◽

...

Keyword(s):

Remote Sensing ◽

Coal Mining ◽

Large Scale ◽

Underground Mining ◽

Water Area ◽

Ecological Environment ◽

Support Vector ◽

Svm Classifier ◽

Nansi Lake ◽

The Impact

Underground mining activity has existed for more than 100 years in Nansi Lake. Coal mining not only plays a supporting role in local social and economic development but also has a significant impact on the ecological environment in the region. Landsat series remote sensing data (1988–2019) are used to study the impact of coal mining on the ecological environment in Nansi Lake. A Support Vector Machine (SVM) classifier is applied to extract the water area of the upstream lake from 1988 to 2019, and the ecological environment and its spatiotemporal variation characteristics are analyzed with the Remote Sensing Ecology Index (RSEI). The results illustrate that the water area change is associated with annual precipitation. Compared with 2009, the ecological quality of the lake is worse in 2019, and this change is attributed to large-scale underground mining. Therefore, the coal mines in the natural reserve may be closed or limited to the mining boundary to protect the lake's ecological environment.


Climatic and atmospheric indices teleconnection impact on the characteristics of frost season in western Iran

Journal of Water and Climate Change ◽

10.2166/wcc.2017.114 ◽

2017 ◽

Vol 10 (2) ◽

pp. 391-401

Author(s):

Zohreh Maryanaji ◽

Leili Tapak ◽

Omid Hamidi

Keyword(s):

Ocean Circulation ◽

El Niño ◽

Large Scale ◽

El Nino ◽

Southern Oscillation ◽

Support Vector ◽

Western Iran ◽

Global Indices ◽

Highly Correlated ◽

The Impact

The large-scale variability of atmospheric and ocean circulation patterns causes seasonal climate changes on Earth. In other words, climate elements are affected by phenomena like the El Niño Southern Oscillation (ENSO), El Niño (NINO), and the North Atlantic Oscillation (NAO). In this study, the characteristics of the frost season over a 20-year period (1996–2015) from seven synoptic stations in western Iran were evaluated using support vector machine and random forest regression. Comparing determination coefficients obtained by these models between atmospheric and ocean circulation indices and the characteristics of the frost season showed a positive effect. Thus, the onset and the end of the frost season in this region were highly correlated with the Southern Oscillation Index (SOI) and NAO, respectively. In regions with lower correlations (central areas and some regions of Alvand Mountain), the role of geographical factors, altitude and topography becomes more pronounced and the impact of the global indices is reduced. Cluster analysis was also conducted to detect patterns and to identify regions according to the effect of the atmospheric and oceanic indices on the frost season, and three regions were identified. The largest correlations with global indices (in both models) belonged to the first and third classes, respectively. The results of this study could be applied for planning environmental and agricultural activities.


A Comparative Analysis of Time-frequency Feature Extraction Techniques for Large Scale Electroencephalogram Data

International Journal of Advanced Trends in Computer Science and Engineering ◽

10.30534/ijatcse/2021/031012021 ◽

2021 ◽

Vol 10 (1) ◽

pp. 14-24

Keyword(s):

Feature Extraction ◽

Large Scale ◽

Extraction Methods ◽

Kernel Functions ◽

The Body ◽

Support Vector ◽

Discrete Wavelet ◽

Eeg Signals ◽

Time Frequency ◽

The Impact

Recognition of human emotions is a fascinating research field that motivates many researchers to use various approaches, such as facial expression, speech or body gestures. Electroencephalogram (EEG) is another approach to recognizing human emotion through brain signals and has offered promising findings. Although EEG signals provide detailed information on human emotional states, the analysis of the non-linear and chaotic characteristics of EEG signals is a substantial problem. The main challenge remains in analyzing EEG signals to extract relevant features in order to achieve optimum classification performance. Various feature extraction methods have been developed by researchers, which can mainly be categorized as time, frequency or time-frequency based feature extraction methods. Yet, there are numerous settings that could affect the performance of any model. In this paper, we investigated the performance of the Discrete Wavelet Transform (DWT) and the Discrete Wavelet Packet Transform (DWPT), which are time-frequency domain methods, using Support Vector Machine (SVM) and k-Nearest Neighbor (KNN) classification techniques. Different SVM kernel functions and KNN distance metrics are tested in this study using subject-dependent and subject-independent approaches. The experiment is implemented using the publicly available DEAP dataset. The experimental results show that DWT is mostly suitable with a weighted KNN classifier, while DWPT reported better results when tested using a linear SVM classifier to accurately classify the EEG signals in the subject-dependent approach. Consistent results are observed for DWT-KNN in the subject-independent approach; however, SVM works better with quadratic kernel functions. These results indicate that further investigation is needed to examine the impact of different method settings in analyzing large-scale EEG data.


A Semisupervised Feature Selection with Support Vector Machine

Journal of Applied Mathematics ◽

10.1155/2013/416320 ◽

2013 ◽

Vol 2013 ◽

pp. 1-11 ◽

Cited By ~ 3

Author(s):

Kun Dai ◽

Hong-Yi Yu ◽

Qing Li

Keyword(s):

Support Vector Machine ◽

Feature Selection ◽

Feature Selection Method ◽

Classification Performance ◽

Learning Problems ◽

Support Vector ◽

Data Sets ◽

Alternating Direction ◽

Optimal Classification ◽

Highly Correlated

Feature selection has proved to be a beneficial tool in learning problems with the main advantages of interpretation and generalization. Most existing feature selection methods do not achieve optimal classification performance, since they neglect the correlations among highly correlated features which all contribute to classification. In this paper, a novel semisupervised feature selection algorithm based on support vector machine (SVM) is proposed, termed SENFS. In order to solve SENFS, an efficient algorithm based on the alternating direction method of multipliers is then developed. One advantage of SENFS is that it encourages highly correlated features to be selected or removed together. Experimental results demonstrate the effectiveness of our feature selection method on simulation data and benchmark data sets.


TRAINING SUPPORT VECTOR MACHINES USING FRANK–WOLFE OPTIMIZATION METHODS

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001413600033 ◽

2013 ◽

Vol 27 (03) ◽

pp. 1360003 ◽

Cited By ~ 6

Author(s):

EMANUELE FRANDI ◽

RICARDO ÑANCULEF ◽

MARIA GRAZIA GASPARO ◽

STEFANO LODI ◽

CLAUDIO SARTORI

Keyword(s):

Large Scale ◽

Binary Classification ◽

Feature Space ◽

Optimization Methods ◽

Learning Problems ◽

Support Vector ◽

Fast Method ◽

Training Support ◽

Lower Accuracy ◽

Vector Machines

Training a support vector machine (SVM) requires the solution of a quadratic programming problem (QP) whose computational complexity becomes prohibitively expensive for large-scale datasets. Traditional optimization methods cannot be directly applied in these cases, mainly due to memory restrictions. By adopting a slightly different objective function and under mild conditions on the kernel used within the model, efficient algorithms to train SVMs have been devised under the name of core vector machines (CVMs). This framework exploits the equivalence of the resulting learning problem with the task of building a minimal enclosing ball (MEB) in a feature space, where data is implicitly embedded by a kernel function. In this paper, we improve on the CVM approach by proposing two novel methods to build SVMs based on the Frank–Wolfe algorithm, recently revisited as a fast method to approximate the solution of a MEB problem. In contrast to CVMs, our algorithms do not require computing the solutions of a sequence of increasingly complex QPs and are defined using only analytic optimization steps. Experiments on a large collection of datasets show that our methods scale better than CVMs in most cases, sometimes at the price of a slightly lower accuracy. Like CVMs, the proposed methods can be easily extended to machine learning problems other than binary classification. However, effective classifiers are also obtained using kernels which do not satisfy the condition required by CVMs, and thus our methods can be used for a wider set of problems.
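As a rough illustration of the minimal enclosing ball idea mentioned above, the sketch below runs a Frank–Wolfe-style iteration (in the form usually credited to Bădoiu and Clarkson) directly on points in the plane. It is not the paper's method: there is no kernel, no SVM, and the function name `approx_meb_center`, the iteration count, and the sample points are assumptions chosen for brevity.

```python
import math


def approx_meb_center(points, iterations=1000):
    """Approximate the center of the minimal enclosing ball of `points`.

    Each step finds the point farthest from the current center (the linear
    subproblem of Frank-Wolfe) and moves the center toward it with the
    analytic step size 1 / (k + 1), so no QP solver is needed.
    """
    center = list(points[0])
    for k in range(1, iterations + 1):
        farthest = max(points, key=lambda p: math.dist(p, center))
        step = 1.0 / (k + 1)
        center = [c + step * (f - c) for c, f in zip(center, farthest)]
    return center


# Corners of a square; the exact ball has center (1, 1) and radius sqrt(2).
corners = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
center = approx_meb_center(corners)
radius = max(math.dist(p, center) for p in corners)
print("center:", center, "radius:", radius)
```

Only distance evaluations and a closed-form update appear in the loop, which is the sense in which such methods rely on analytic optimization steps alone.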


A FAST SVM TRAINING ALGORITHM

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001403002423 ◽

2003 ◽

Vol 17 (03) ◽

pp. 367-384 ◽

Cited By ~ 22

Author(s):

JIAN-XIONG DONG ◽

CHING Y. SUEN ◽

ADAM KRZYŻAK

Keyword(s):

Data Mining ◽

Principal Component Analysis ◽

Support Vector Machine ◽

Large Scale ◽

Principal Component ◽

Learning Problems ◽

Support Vector ◽

Training Algorithm ◽

Test Set ◽

Handwritten Digit

A fast support vector machine (SVM) training algorithm is proposed under SVM's decomposition framework by effectively integrating kernel caching, digest and shrinking policies and stopping conditions. Kernel caching plays a key role in reducing the number of kernel evaluations by maximal reuse of cached kernel elements. Extensive experiments have been conducted on the large handwritten digit database MNIST to show that the proposed algorithm is about nine times faster than Keerthi et al.'s improved SMO. Combined with principal component analysis, the total training for ten one-against-the-rest classifiers on MNIST took less than an hour. Moreover, the proposed fast algorithm speeds up SVM training without sacrificing generalization performance. A 0.6% error rate on the MNIST test set has been achieved. The promising scalability of the proposed scheme paves a new way to solve more large-scale learning problems in other domains such as data mining.
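The benefit of kernel caching can be sketched with a small least-recently-used cache of kernel rows. This is a generic illustration, not the caching, digest or shrinking policy of the paper: the class name `KernelRowCache`, the RBF kernel, the `gamma` value, and the capacity are all hypothetical choices.

```python
import math
from collections import OrderedDict


class KernelRowCache:
    """LRU cache of kernel matrix rows that counts kernel evaluations."""

    def __init__(self, data, gamma=0.5, capacity=2):
        self.data = data
        self.gamma = gamma
        self.capacity = capacity
        self.rows = OrderedDict()  # row index -> list of kernel values
        self.evaluations = 0       # number of kernel evaluations so far

    def _kernel(self, a, b):
        """RBF kernel exp(-gamma * ||a - b||^2) (assumed kernel choice)."""
        self.evaluations += 1
        squared = sum((u - v) ** 2 for u, v in zip(a, b))
        return math.exp(-self.gamma * squared)

    def row(self, i):
        """Return row i of the kernel matrix, computing it only on a miss."""
        if i in self.rows:
            self.rows.move_to_end(i)  # mark as most recently used
            return self.rows[i]
        values = [self._kernel(self.data[i], x) for x in self.data]
        self.rows[i] = values
        if len(self.rows) > self.capacity:
            self.rows.popitem(last=False)  # evict least recently used row
        return values


samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
cache = KernelRowCache(samples)
cache.row(0)
cache.row(0)  # served from the cache, no new evaluations
print("evaluations after two requests for row 0:", cache.evaluations)
```

A decomposition solver that repeatedly revisits the same working-set indices would request the same rows many times, so each cache hit saves a full row of kernel evaluations.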
