Combining complex networks and data mining: why and how

2016 ◽  
Author(s):  
M. Zanin ◽  
D. Papo ◽  
P. A. Sousa ◽  
E. Menasalvas ◽  
A. Nicchi ◽  
...  

Abstract: The increasing power of computer technology does not dispense with the need to extract meaningful information out of data sets of ever growing size, and indeed typically exacerbates the complexity of this task. To tackle this general problem, two methods have emerged, at chronologically different times, that are now commonly used in the scientific community: data mining and complex network theory. Not only do complex network analysis and data mining share the same general goal, that of extracting information from complex systems to ultimately create a new compact quantifiable representation, but they also often address similar problems. Despite this, surprisingly few researchers resort to both methodologies. One may then be tempted to conclude that these two fields are either largely redundant or totally antithetic. The starting point of this review is that this state of affairs should be put down to contingent rather than conceptual differences, and that these two fields can in fact advantageously be used in a synergistic manner. An overview of both fields is first provided, some fundamental concepts of which are illustrated. A variety of contexts in which complex network theory and data mining have been used in a synergistic manner are then presented. Contexts in which the appropriate integration of complex network metrics can lead to improved classification rates with respect to classical data mining algorithms and, conversely, contexts in which data mining can be used to tackle important issues in complex network theory applications are illustrated. Finally, ways to achieve a tighter integration between complex networks and data mining, and open lines of research are discussed.

F1000Research ◽  
2016 ◽  
Vol 5 ◽  
pp. 2675 ◽  
Author(s):  
Massimiliano Zanin

Complex network theory has been used, during the last decade, to understand the structures behind complex biological problems, yielding new knowledge in a large number of situations. Nevertheless, such knowledge has remained mostly qualitative. In this contribution, I show how information extracted from a network representation can be used in a quantitative way, to improve the score of a classification task. As a test bed, I consider a dataset corresponding to patients suffering from prostate cancer, and the task of successfully prognosing their survival. When information from a complex network representation is added on top of a simple classification model, the error is reduced from 27.9% to 23.8%. This confirms that network theory can be used to synthesize information that may not readily be accessible by standard data mining algorithms.
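To put the reported improvement in perspective, the short calculation below derives the absolute and relative error reduction from the two figures quoted in the abstract. The variable names are illustrative only; the two error rates are the only values taken from the source.

```python
# Error rates quoted in the abstract, in percent.
baseline_error = 27.9   # simple classification model alone
network_error = 23.8    # same model plus network-derived information

# Absolute reduction in percentage points.
absolute_gain = baseline_error - network_error

# Reduction relative to the baseline error (a fraction between 0 and 1).
relative_gain = absolute_gain / baseline_error

print(f"absolute reduction: {absolute_gain:.1f} percentage points")
print(f"relative reduction: {relative_gain:.1%}")
```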


2013 ◽  
Vol 433-435 ◽  
pp. 788-792
Author(s):  
Xian Xia Qiao ◽  
Ying Ding Zhao

The purpose of our research on gene expression regulation networks is to fully characterize the genome from the perspective of system functionality and behavior. In essence, a gene expression regulation network is a complex network. We study gene expression regulation networks using complex network theory, examining network properties such as average path length, clustering coefficient, degree distribution, scale-free features, and the small-world effect. The biological significance of the research is that, taking the complex-systems point of view as a starting point, it examines the topology of the network of interactions among genes and proteins in order to reveal complex mechanisms and functional information.
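As a rough illustration of the network properties named in this abstract, the sketch below computes average path length, average clustering coefficient, and degree distribution for a small undirected graph using only the Python standard library. The gene names and edges are hypothetical toy data, not taken from the paper, and treating the regulatory network as undirected is a simplifying assumption.

```python
from collections import Counter, deque

def build_graph(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from edge pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def average_path_length(adj):
    """Mean shortest-path length over all connected ordered node pairs (BFS)."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nb in adj[node]:
                if nb not in dist:
                    dist[nb] = dist[node] + 1
                    queue.append(nb)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0

def clustering_coefficient(adj):
    """Average local clustering: share of a node's neighbor pairs that are linked."""
    values = []
    for nbs in adj.values():
        k = len(nbs)
        if k < 2:
            values.append(0.0)
            continue
        links = sum(1 for a in nbs for b in nbs if a < b and b in adj[a])
        values.append(2 * links / (k * (k - 1)))
    return sum(values) / len(values)

def degree_distribution(adj):
    """Map each degree k to the fraction of nodes having that degree."""
    counts = Counter(len(nbs) for nbs in adj.values())
    return {k: c / len(adj) for k, c in counts.items()}

# Hypothetical toy network: a triangle of genes plus one pendant gene.
toy_edges = [("geneA", "geneB"), ("geneB", "geneC"),
             ("geneA", "geneC"), ("geneC", "geneD")]
toy = build_graph(toy_edges)

print("average path length:", average_path_length(toy))
print("clustering coefficient:", clustering_coefficient(toy))
print("degree distribution:", degree_distribution(toy))
```

A scale-free network would show a heavy-tailed degree distribution, and a small-world network would combine a short average path length with high clustering; a four-node toy graph is far too small to exhibit either.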


2014 ◽  
Vol 2014 ◽  
pp. 1-11 ◽  
Author(s):  
Si-hua Chen ◽  
Wei He

As a platform based on users' relationships for acquiring, sharing, and propagating knowledge, Wechat has developed very rapidly and become an important channel for spreading knowledge. This new way of propagating knowledge is quite different from that of traditional media, and it enables knowledge to spread to a surprising degree in Wechat. Based on complex network theory and an analysis of the factors that influence knowledge propagation in Wechat, this paper summarizes the behavior preferences of Wechat users in knowledge propagation and establishes a Wechat knowledge propagation model. Through a simulation experiment, this paper tests the model and finds some important thresholds in knowledge propagation in Wechat. The findings are valuable for further study of knowledge propagation in Wechat and provide theoretical support for forecasting the scale and influence of knowledge propagation.
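The abstract does not specify the propagation model, so the sketch below substitutes a generic independent-cascade simulation to show how threshold-like behavior can be probed numerically. The network generator, the `share_prob` parameter, and all function names are assumptions made for illustration and are not the authors' model.

```python
import random

def simulate_spread(adj, seed_node, share_prob, rng):
    """Independent-cascade sketch: each newly informed user gets one chance to
    pass the knowledge to each uninformed contact with probability share_prob."""
    informed = {seed_node}
    frontier = [seed_node]
    while frontier:
        next_frontier = []
        for user in frontier:
            for contact in adj[user]:
                if contact not in informed and rng.random() < share_prob:
                    informed.add(contact)
                    next_frontier.append(contact)
        frontier = next_frontier
    return len(informed)

def ring_with_shortcuts(n, shortcuts, rng):
    """Hypothetical contact network: a ring of n users plus random shortcut ties."""
    adj = {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}
    for _ in range(shortcuts):
        u, v = rng.sample(range(n), 2)
        adj[u].add(v)
        adj[v].add(u)
    return adj

def mean_reach(adj, share_prob, runs, rng):
    """Average fraction of users informed over several randomly seeded runs."""
    users = list(adj)
    total = sum(simulate_spread(adj, rng.choice(users), share_prob, rng)
                for _ in range(runs))
    return total / (runs * len(adj))

rng = random.Random(42)  # fixed seed so repeated runs give the same numbers
network = ring_with_shortcuts(200, 100, rng)
for p in (0.1, 0.3, 0.5, 0.7):
    print(f"share_prob={p:.1f}  mean reach={mean_reach(network, p, 200, rng):.2f}")
```

Sweeping `share_prob` and watching where the mean reach rises sharply is one simple way to look for a propagation threshold in a model of this kind.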


Author(s):  
Till Becker ◽  
Mirja Meyer ◽  
Katja Windt

Purpose – The topology of manufacturing systems is specified during the design phase and can afterwards only be adjusted at high expense. The purpose of this paper is to exploit the availability of large-scale data sets in manufacturing by applying measures from complex network theory and from classical performance evaluation to investigate the relation between structure and performance. Design/methodology/approach – The paper develops a manufacturing system network model that is composed of measures from complex network theory. The analysis is based on six company data sets containing up to half a million operation records. The paper uses the network model as a straightforward approach to assess the manufacturing systems and to evaluate the impact of topological measures on fundamental performance figures, e.g., work in process or lateness. Findings – The paper is able to show that the manufacturing systems network model is a low-effort approach to quickly assess a manufacturing system. Additionally, the paper demonstrates that manufacturing networks display distinct, non-random network characteristics on a network-wide scale and that the relations between topological and performance key figures are non-linear. Research limitations/implications – The sample consists of six data sets from Germany-based manufacturing companies. As the model is universal, it can easily be applied to further data sets from any industry. Practical implications – The model can be utilized to quickly analyze large data sets without employing classical methods (e.g. simulation studies) which require time-intensive modeling and execution. Originality/value – This paper explores for the first time the application of network figures in manufacturing systems in relation to performance figures by using real data from manufacturing companies.


Author(s):  
Shuang Song ◽  
Dawei Xu ◽  
Shanshan Hu ◽  
Mengxi Shi

Habitat destruction and declining ecosystem service levels caused by urban expansion have led to increased ecological risks in cities, and ecological network optimization has become the main way to resolve this contradiction. Here, using landscape pattern, meteorological, and hydrological data as data sources, we applied complex network theory, landscape ecology, and spatial analysis technology to quantitatively analyze the current landscape pattern characteristics of the central district of Harbin. The minimum cumulative resistance model was used to extract the ecological network of the study area. We then optimized the ecological network using edge-adding strategies from complex network theory, compared the optimizing effects of the different edge-adding strategies by robustness analysis, and put forward an effective way to optimize the ecological network of the study area. The results demonstrate that the ecological patches of Daowai, Xiangfang, Nangang, and other old districts in the study area are small in size, few in number, and strongly fragmented, with a single external morphology and high internal porosity, while the ecological patches in the new districts of Songbei, Hulan, and Acheng have a relatively good foundation. Ecological network connectivity in the study area is generally poor: the ecological corridors are relatively sparse and scattered, and the connections between the ecological sources along the corridors are not close. Among the edge-adding strategies compared, the low-degree-first strategy performs best in the robustness test. Using the low-degree-first strategy to optimize the ecological network of the study area adds 43 ecological corridors. After the optimization, the large and small ecological corridors are evenly distributed and form a complete network; the optimized ecological network will be significantly more connected, resilient, and resistant to interference, and ecological flow transmission will be more efficient.
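The low-degree-first idea and a robustness comparison can be sketched on a toy graph as follows. This is a minimal illustration under stated assumptions: the abstract does not define its exact edge-selection rule or robustness metric, so the tie-breaking rule, the targeted-attack ordering, and the largest-component score used here are all illustrative choices, and the six-node chain is hypothetical data.

```python
from collections import deque

def largest_component_size(adj, removed):
    """Size of the largest connected component after dropping `removed` nodes."""
    seen, best = set(removed), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            node = queue.popleft()
            size += 1
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        best = max(best, size)
    return best

def add_edges_low_degree_first(adj, n_new):
    """Return a copy with up to n_new extra edges, each joining the lowest-degree
    pair of nodes that is not already linked (ties broken by node order)."""
    new_adj = {node: set(nbs) for node, nbs in adj.items()}
    for _ in range(n_new):
        order = sorted(new_adj, key=lambda node: (len(new_adj[node]), node))
        pair = next(((u, v) for i, u in enumerate(order) for v in order[i + 1:]
                     if v not in new_adj[u]), None)
        if pair is None:  # graph is already complete
            break
        u, v = pair
        new_adj[u].add(v)
        new_adj[v].add(u)
    return new_adj

def robustness(adj):
    """Mean relative size of the largest component while nodes are removed one
    at a time in order of decreasing initial degree (a targeted attack)."""
    attack = sorted(adj, key=lambda node: (-len(adj[node]), node))
    n = len(adj)
    sizes = [largest_component_size(adj, attack[:k]) / n for k in range(1, n)]
    return sum(sizes) / len(sizes)

# Hypothetical toy network: a chain of six ecological sources.
chain = {i: {j for j in (i - 1, i + 1) if 0 <= j < 6} for i in range(6)}
optimized = add_edges_low_degree_first(chain, 2)

print("robustness before:", round(robustness(chain), 3))
print("robustness after: ", round(robustness(optimized), 3))
```

On this chain, the two added edges first close the chain into a ring and then add one chord, which raises the robustness score because no single removal can split the network as easily.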

