Authorship Attribution via Network Motifs Identification

Author(s):  
Vanessa Queiroz Marinho ◽  
Graeme Hirst ◽  
Diego Raphael Amancio
2015 ◽  
Author(s):  
Upendra Sapkota ◽  
Steven Bethard ◽  
Manuel Montes ◽  
Thamar Solorio

2019 ◽  
Vol 16 (5) ◽  
pp. 392-401
Author(s):  
Shengli Zhang ◽  
Zekun Tong ◽  
Haoyu Yin ◽  
Yifan Feng

Background: Identifying pathogenic genes is important for understanding disease pathogenesis, locating effective drug targets, and improving clinical treatment. However, existing methods for finding pathogenic genes still have limitations: for instance, their computational complexity is high, and they do not consider combinations of multiple genes and pathways when searching for highly related pathogenic genes. Methods: We propose a pathogenic gene selection model for genetic diseases based on Network Motifs Slicing Feedback (NMSF). We find a point set that minimizes the conductivity of the motif and then use it as a substitute for the original gene pathway network. Based on NMSF, we propose a new pathogenic gene selection model to expand the pathogenic gene set. Results: With the gene set obtained, the selection of key genes becomes more accurate and convincing. Finally, we use our model to screen the pathogenic genes and key pathways of liver cancer and lung cancer, and compare the results with those of existing methods. Conclusion: The main contribution is a method, NMSF, that simplifies the gene pathway network to make the selection of pathogenic genes simple and feasible. The results show wide coverage and high accuracy, and the model is fast and robust.
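The abstract does not define "conductivity of the motif". A minimal sketch follows, assuming it corresponds to motif conductance as used in higher-order graph clustering: the number of motif instances split by a node set, divided by the smaller of the two motif volumes. The triangle motif, the toy graph, and the function names are illustrative assumptions and are not taken from the NMSF paper.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple

Graph = Dict[int, Set[int]]  # undirected graph as an adjacency map


def triangles(graph: Graph) -> List[Tuple[int, int, int]]:
    """List every triangle (the motif assumed here) as a sorted node triple."""
    found = []
    for a, b, c in combinations(sorted(graph), 3):
        if b in graph[a] and c in graph[a] and c in graph[b]:
            found.append((a, b, c))
    return found


def motif_conductance(graph: Graph, subset: Set[int]) -> float:
    """Triangle conductance of `subset`: cut motifs / smaller motif volume."""
    instances = triangles(graph)
    # Number of nodes each triangle has inside the subset.
    inside = [sum(node in subset for node in tri) for tri in instances]
    # A triangle is cut when it has nodes on both sides of the partition.
    cut = sum(1 for k in inside if 0 < k < 3)
    vol_in = sum(inside)                   # motif end points inside the subset
    vol_out = 3 * len(instances) - vol_in  # motif end points outside it
    smaller = min(vol_in, vol_out)
    return cut / smaller if smaller else float("inf")


# Toy graph (hypothetical): triangles (0,1,2), (2,3,4), (3,4,5).
toy = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3, 4},
    3: {2, 4, 5}, 4: {2, 3, 5}, 5: {3, 4},
}
print(motif_conductance(toy, {0, 1, 2}))  # 0.25
```

A node set with low motif conductance keeps most motif instances intact, which is one plausible reading of why such a set could stand in for the full pathway network.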


2019 ◽  
Vol 35 (4) ◽  
pp. 812-825 ◽  
Author(s):  
Robert Gorman

Abstract How to classify short texts effectively remains an important question in computational stylometry. This study presents the results of an experiment involving authorship attribution of ancient Greek texts. These texts were chosen to explore the effectiveness of digital methods as a supplement to the author’s work on text classification based on traditional stylometry. Here it is crucial to avoid confounding effects of shared topic, etc. Therefore, this study attempts to identify authorship using only morpho-syntactic data without regard to specific vocabulary items. The data are taken from the dependency annotations published in the Ancient Greek and Latin Dependency Treebank. The independent variables for classification are combinations generated from the dependency label and the morphology of each word in the corpus and its dependency parent. To avoid the effects of the combinatorial explosion, only the most frequent combinations are retained as input features. The authorship classification (with thirteen classes) is done with standard algorithms—logistic regression and support vector classification. During classification, the corpus is partitioned into increasingly smaller ‘texts’. To explore and control for the possible confounding effects of, e.g. different genre and annotator, three corpora were tested: a mixed corpus of several genres of both prose and verse, a corpus of prose including oratory, history, and essay, and a corpus restricted to narrative history. Results are surprisingly good as compared to those previously published. Accuracy for fifty-word inputs is 84.2–89.6%. Thus, this approach may prove an important addition to the prevailing methods for small text classification.
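The feature construction described above can be sketched as follows. This is an illustration only: the `Token` layout, the string encoding of each combination, and the function names are assumptions, since the abstract does not specify the exact combinations used. Each word contributes one feature built from its own dependency label and morphology together with those of its parent, and only the most frequent features are kept as the vocabulary.

```python
from collections import Counter
from typing import List, NamedTuple


class Token(NamedTuple):
    relation: str    # dependency label of the word
    morphology: str  # morphological tag (simplified here to a single string)
    head: int        # index of the dependency parent, -1 for the root


Sentence = List[Token]


def token_features(sentence: Sentence) -> List[str]:
    """One feature per word: its label and morphology plus its parent's."""
    features = []
    for token in sentence:
        if token.head >= 0:
            parent = sentence[token.head]
            parent_part = f"{parent.relation}|{parent.morphology}"
        else:
            parent_part = "ROOT"
        features.append(f"{token.relation}|{token.morphology}<-{parent_part}")
    return features


def top_features(sentences: List[Sentence], n: int) -> List[str]:
    """Keep only the n most frequent combinations to limit feature growth."""
    counts = Counter(f for s in sentences for f in token_features(s))
    return [feature for feature, _ in counts.most_common(n)]


def vectorize(sentence: Sentence, vocabulary: List[str]) -> List[int]:
    """Count vector over the retained vocabulary; no word forms are used."""
    counts = Counter(token_features(sentence))
    return [counts[feature] for feature in vocabulary]


# Hypothetical two-sentence corpus with made-up labels.
s1 = [Token("SBJ", "noun", 1), Token("PRED", "verb", -1), Token("OBJ", "noun", 1)]
s2 = [Token("SBJ", "noun", 1), Token("PRED", "verb", -1)]
vocabulary = top_features([s1, s2], 2)
print(vectorize(s1, vocabulary))  # [1, 1]
```

Vectors of this kind could then be passed to a standard classifier such as logistic regression or a support vector classifier, as the abstract describes.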


2019 ◽  
Vol 11 (9) ◽  
pp. 2484 ◽  
Author(s):  
Ying Jin ◽  
Ye Wei ◽  
Chunliang Xiu ◽  
Wei Song ◽  
Kaixian Yang

The air passenger transport network is an important agent of social and economic connections between cities. Studying the structure of airline networks and providing optimization strategies can improve the sustainable development of the airline industry. Network motifs, the basic building blocks of larger networks, are applied in this paper to analyze the structural characteristics of the passenger airline network. The ENUMERATE SUBGRAPHS (G, k) algorithm is used to identify the motifs and anti-motifs of the passenger airline network in China. Motif identification is carried out for a total of 37 airline companies, and the structural and functional characteristics of the airline networks corresponding to different motifs are explored. These 37 airline companies are classified by their motif concentration curves into three development stages: mono-centric divergence companies at the low-level development stage, transitional companies at the intermediate development stage, and multi-centric and hierarchical companies at the advanced development stage. Finally, we found that adjusting the number of appropriate network motifs helps optimize the overall structure of airline networks, which benefits the sustainable development of air transport.
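For readers unfamiliar with ENUMERATE SUBGRAPHS (G, k), the sketch below follows the usual formulation of the ESU procedure: every connected k-node subgraph is grown from its smallest-numbered node, and a node may join only if it has a larger label than that root and is not already adjacent to the current subgraph. The undirected toy graph and the triangle-versus-path classification are assumptions made for illustration; the paper's own network representation is not given in the abstract.

```python
from typing import Dict, FrozenSet, Iterator, Set

Graph = Dict[int, Set[int]]  # undirected graph as an adjacency map


def enumerate_subgraphs(graph: Graph, k: int) -> Iterator[FrozenSet[int]]:
    """Yield each connected k-node subgraph exactly once (ESU scheme)."""
    for root in graph:
        extension = {u for u in graph[root] if u > root}
        yield from _extend({root}, extension, root, graph, k)


def _extend(sub: Set[int], ext: Set[int], root: int,
            graph: Graph, k: int) -> Iterator[FrozenSet[int]]:
    if len(sub) == k:
        yield frozenset(sub)
        return
    ext = set(ext)  # local copy so the caller's set is not modified
    # Nodes in the subgraph together with all of their neighbours.
    closed = set(sub)
    for node in sub:
        closed |= graph[node]
    while ext:
        w = ext.pop()
        # Exclusive neighbours of w: larger than the root, not yet reachable.
        exclusive = {u for u in graph[w] if u > root and u not in closed}
        yield from _extend(sub | {w}, ext | exclusive, root, graph, k)


def edge_count(graph: Graph, nodes: FrozenSet[int]) -> int:
    """Number of edges inside a node set (3 = triangle, 2 = open path for k=3)."""
    return sum(1 for a in nodes for b in graph[a] if b in nodes and a < b)


# Toy network (hypothetical): a triangle 0-1-2 with a pendant node 3 on node 2.
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
found = list(enumerate_subgraphs(toy, 3))
print(sorted(sorted(s) for s in found))  # [[0, 1, 2], [0, 2, 3], [1, 2, 3]]
```

Motif and anti-motif detection would then compare the counts of each subgraph type against counts in randomized networks; that step is omitted here.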

