Efficiently Detecting Frequent Patterns in Biological Sequences

Author(s):  
Wei Liu ◽  
Ling Chen
Symmetry ◽  
2020 ◽  
Vol 12 (12) ◽  
pp. 2090
Author(s):  
Yue Lu ◽  
Long Zhao ◽  
Zhao Li ◽  
Xiangjun Dong

Similarity analysis of DNA sequences can clarify the homology between sequences and predict their structure and the relationships between them. At the same time, the frequent patterns of biological sequences not only explain the genetic characteristics of the organism but also serve as relevant markers for certain events in biological sequences. However, most existing biological sequence similarity analysis methods target the entire sequential pattern and ignore missing gene fragments that may induce potential disease. Similarity analysis of sequences containing a missing gene item remains unexplored. Consequently, some sequences with missing bases are ignored or not effectively analyzed. This paper therefore presents a new method for DNA sequence similarity analysis. Using this method, we first mined not only positive sequential patterns but also sequential patterns that were missing some of the base terms (collectively referred to as negative sequential patterns). Subsequently, we used these frequent patterns for similarity analysis on a two-dimensional plane. Several experiments were conducted to verify the effectiveness of this algorithm. The experimental results demonstrated that the algorithm can obtain various results through the selection of frequent sequential patterns and that accuracy and time efficiency were improved.
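The distinction between positive and negative sequential patterns can be illustrated with a minimal Python sketch. It is not the authors' algorithm. It assumes one common convention from the negative sequential pattern literature: a sequence supports "left, NOT missing, right" when it contains left followed by right as a subsequence but does not contain left, missing, right as a subsequence. The function names and the toy sequences are hypothetical.

```python
def is_subsequence(pattern, sequence):
    """True if the characters of `pattern` appear in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    return all(ch in it for ch in pattern)


def positive_support(pattern, sequences):
    """Number of sequences that contain `pattern` as a subsequence."""
    return sum(is_subsequence(pattern, s) for s in sequences)


def negative_support(left, missing, right, sequences):
    """Number of sequences that contain left+right but NOT left+missing+right.

    Assumed convention (not taken from the abstract): the negated element
    is 'absent' when the full positive pattern cannot be matched.
    """
    return sum(
        is_subsequence(left + right, s)
        and not is_subsequence(left + missing + right, s)
        for s in sequences
    )


# Toy data, invented for illustration only.
toy = ["ACG", "AG", "ATG", "CCG"]
pos = positive_support("AG", toy)            # sequences containing A ... G
neg = negative_support("A", "C", "G", toy)   # A ... G with no C in between
```

Patterns whose positive or negative support reaches a chosen threshold would then be the frequent patterns passed on to the similarity analysis.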


2019 ◽  
Vol 14 (4) ◽  
pp. 574-589
Author(s):  
Linyan Xue ◽  
Xiaoke Zhang ◽  
Fei Xie ◽  
Shuang Liu ◽  
Peng Lin

In bioinformatics applications, existing algorithms cannot directly and efficiently implement sequence pattern mining. This paper proposes two fast and efficient biological sequence pattern mining algorithms, one for a single biological sequence and one for multiple sequences. The concept of the basic pattern is proposed, and, on the basis of mining frequent basic patterns, frequent patterns are mined by constructing prefix trees for the frequent basic patterns. The proposed algorithms thus implement rapid mining of frequent patterns of biological sequences based on pattern prefix trees. In the experiments, family sequence data from the Pfam protein database are used to verify the performance of the proposed algorithms. The results confirm that the proposed algorithms can not only obtain mining results with effective biological significance, but also improve the running time efficiency of biological sequence pattern mining.
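The general idea of growing frequent patterns from a prefix tree of short frequent patterns can be sketched in Python as follows. This is only an illustration under stated assumptions, not the paper's algorithm: a "basic pattern" is taken here to be a contiguous substring of fixed length k, support is the number of sequences containing the pattern, and all function names, parameters, and the toy data are hypothetical.

```python
from collections import defaultdict


def support(pattern, sequences):
    """Number of sequences containing `pattern` as a contiguous substring."""
    return sum(pattern in seq for seq in sequences)


def frequent_basic_patterns(sequences, k, min_support):
    """Length-k substrings (our stand-in for 'basic patterns') with enough support."""
    seen_in = defaultdict(set)
    for idx, seq in enumerate(sequences):
        for i in range(len(seq) - k + 1):
            seen_in[seq[i:i + k]].add(idx)
    return {p for p, ids in seen_in.items() if len(ids) >= min_support}


def build_prefix_tree(patterns):
    """Nested-dict trie; the key '$' marks the end of a stored pattern."""
    root = {}
    for p in patterns:
        node = root
        for ch in p:
            node = node.setdefault(ch, {})
        node["$"] = True
    return root


def patterns_with_prefix(tree, prefix):
    """Yield every stored pattern that starts with `prefix`."""
    node = tree
    for ch in prefix:
        node = node.get(ch)
        if node is None:
            return
    stack = [(node, prefix)]
    while stack:
        node, text = stack.pop()
        for key, child in node.items():
            if key == "$":
                yield text
            else:
                stack.append((child, text + key))


def mine_frequent_patterns(sequences, k=2, min_support=2):
    """Level-wise growth: extend each frequent pattern by one character,
    using the trie to propose only extensions whose last k characters
    form a frequent basic pattern."""
    basic = frequent_basic_patterns(sequences, k, min_support)
    tree = build_prefix_tree(basic)
    frequent, frontier = set(basic), set(basic)
    while frontier:
        grown = set()
        for p in frontier:
            overlap = p[-(k - 1):] if k > 1 else ""
            for b in patterns_with_prefix(tree, overlap):
                cand = p + b[-1]
                if cand not in frequent and support(cand, sequences) >= min_support:
                    grown.add(cand)
        frequent |= grown
        frontier = grown
    return frequent


# Toy data, invented for illustration only.
result = mine_frequent_patterns(["ACGTAC", "TACGTT", "GACGTA"], k=2, min_support=3)
```

The trie lookup limits candidate extensions to those consistent with a frequent basic pattern, so the full support count is only computed for a small set of candidates at each level.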


Author(s):  
EL-Mehdi Ali ◽  
Yan-Lin He ◽  
QunXiong Zhu

2010 ◽  
Vol 36 (5) ◽  
pp. 674-684
Author(s):  
Feng WU ◽  
Yan ZHONG ◽  
Quan-Yuan WU

2018 ◽  
Vol 25 (9) ◽  
pp. 822-829
Author(s):  
Wei Zhao ◽  
Likun Wang ◽  
Tian-Xiang Zhang ◽  
Ze-Ning Zhao ◽  
Pu-Feng Du
