Maintaining range trees in secondary memory

Background: DNA and protein sequences of an organism contain repeated structures of various types. These repeated structures play an important role in molecular biology because they are related to the genetic backgrounds of inherited diseases. They also serve as markers for DNA mapping and DNA fingerprinting. Efficient searching of maximal and super maximal repeats in DNA/protein sequences can lead to many other applications in genomics. Moreover, these repeats can be used to identify critical diseases by finding the similarity between frequency distributions of repeats in viruses and genomes (without using alignment algorithms). Objective: The study aims to develop an efficient tool for searching maximal and super maximal repeats in large DNA/protein sequences. Methods: The proposed tool uses a newly introduced data structure, the Induced Enhanced Suffix Array (IESA). IESA is an extension of the enhanced suffix array that uses an induced suffix array instead of the classical suffix array. IESA consists of the Induced Suffix Array (ISA) and an additional array, the Longest Common Prefix (LCP) array. ISA is an array of all sorted suffixes of the input sequence, while the LCP array stores the lengths of the longest common prefixes between all pairs of consecutive suffixes in the induced suffix array. IESA is known to be efficient with respect to both time and space. It facilitates the use of secondary memory for constructing large suffix arrays. Results: An open source standalone tool named MSR-IESA for searching maximal and super maximal repeats in DNA/protein sequences is provided at https://github.com/sanjeevalg/MSRIESA. Experimental results show that the proposed algorithm outperforms other state-of-the-art works with respect to both time and space. Conclusion: The proposed tool MSR-IESA is remarkably efficient for the analysis of DNA/protein sequences having maximal and super maximal repeats of any length. It can be used for identification of well-known diseases.
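To make the suffix array and LCP array in this abstract concrete, the following is a minimal sketch, not the MSR-IESA implementation. It assumes a naive in-memory sort in place of induced sorting, and the function names (`suffix_array`, `lcp_array`, `longest_repeat`) are hypothetical. It reports only the longest repeated substring, which is one repeat and not the full enumeration of maximal and super maximal repeats.

```python
def suffix_array(s):
    # Naive construction: sort suffix start positions by the suffix text.
    # This is an assumption for illustration; it is NOT induced sorting.
    return sorted(range(len(s)), key=lambda i: s[i:])


def lcp_array(s, sa):
    # Kasai-style computation: lcp[i] is the length of the longest common
    # prefix of the suffixes starting at sa[i - 1] and sa[i]; lcp[0] is 0.
    n = len(s)
    rank = [0] * n
    for i, p in enumerate(sa):
        rank[p] = i
    lcp = [0] * n
    h = 0
    for p in range(n):
        if rank[p] > 0:
            q = sa[rank[p] - 1]
            while p + h < n and q + h < n and s[p + h] == s[q + h]:
                h += 1
            lcp[rank[p]] = h
            if h > 0:
                h -= 1
        else:
            h = 0
    return lcp


def longest_repeat(s):
    # The largest LCP value marks the longest substring occurring twice.
    if not s:
        return ""
    sa = suffix_array(s)
    lcp = lcp_array(s, sa)
    i = max(range(len(s)), key=lcp.__getitem__)
    return s[sa[i]:sa[i] + lcp[i]]


if __name__ == "__main__":
    text = "banana"
    sa = suffix_array(text)
    print(sa, lcp_array(text, sa), longest_repeat(text))
```

The naive sort is quadratic or worse on long inputs, which is the cost that linear-time and external-memory suffix array construction is meant to avoid.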


AN EXTENDED FUZZY CLUSTERING ALGORITHM AND ITS APPLICATION

Journal of Circuits System and Computers ◽

10.1142/s0218126695000175 ◽

1995 ◽

Vol 05 (02) ◽

pp. 239-259

Author(s):

SU HWAN KIM ◽

SEON WOOK KIM ◽

TAE WON RHEE

Keyword(s):

Fuzzy Clustering ◽

Clustering Algorithm ◽

Clustering Algorithms ◽

Main Memory ◽

Color Image Segmentation ◽

Occurrence Rate ◽

Secondary Memory ◽

Worst Case ◽

Memory Space ◽

Fuzzy Clustering Algorithm

In data analysis, it is very important to combine data with similar attribute values into a categorically homogeneous subset, called a cluster; this technique is called clustering. Crisp clustering algorithms are generally sensitive to noise, because each datum must be assigned to exactly one cluster. To solve this problem, fuzzy c-means, fuzzy maximum likelihood estimation, and optimal fuzzy clustering algorithms based on fuzzy set theory have been proposed. They, however, require a lot of processing time because they iterate exhaustively over all the data and their memberships. In particular, a large memory footprint degrades performance in real-time processing applications, because it takes too much time to swap between the main memory and the secondary memory. To overcome these limitations, an extended fuzzy clustering algorithm based on an unsupervised optimal fuzzy clustering algorithm is proposed in this paper. This algorithm assigns a weight factor to each distinct datum according to its occurrence rate. The proposed extended fuzzy clustering algorithm also considers the degree of importance of each attribute, which determines the characteristics of the data. The worst case is that the whole data set has a uniformly normal distribution, which means that the importance of all attributes is the same. The proposed extended fuzzy clustering algorithm performs better than the unsupervised optimal fuzzy clustering algorithm in terms of memory space and execution time in most cases. For simulation, the proposed algorithm is applied to color image segmentation. Automatic target detection and multipeak detection are also considered as applications. These schemes can be applied to any other fuzzy clustering algorithm.
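The idea of weighting each distinct datum by its occurrence rate can be sketched as follows. This is a hedged illustration only: it applies the weights to plain fuzzy c-means on one-dimensional data, not to the unsupervised optimal fuzzy clustering algorithm the paper extends, and it omits attribute importance. The function name `weighted_fcm_1d`, the even initial spacing of the centers, and the parameter defaults are assumptions.

```python
from collections import Counter


def weighted_fcm_1d(values, c=2, m=2.0, iters=50):
    # Collapse duplicates: each distinct value is stored once with a weight
    # equal to its occurrence count, so memory scales with distinct values.
    counts = Counter(values)
    xs = sorted(counts)
    ws = [counts[x] for x in xs]

    # Deterministic start (an assumption): centers spread over the data range.
    lo, hi = xs[0], xs[-1]
    centers = [lo + (hi - lo) * k / (c - 1) for k in range(c)]

    power = 2.0 / (m - 1.0)
    for _ in range(iters):
        # Membership update, one row per distinct value.
        u = []
        for x in xs:
            d = [abs(x - v) for v in centers]
            if any(dk == 0 for dk in d):
                # The value sits exactly on one or more centers.
                row = [1.0 if dk == 0 else 0.0 for dk in d]
                total = sum(row)
                row = [r / total for r in row]
            else:
                row = [1.0 / sum((dk / dj) ** power for dj in d) for dk in d]
            u.append(row)

        # Center update: the occurrence weight multiplies each contribution.
        centers = [
            sum(w * ui[k] ** m * x for x, w, ui in zip(xs, ws, u))
            / sum(w * ui[k] ** m for w, ui in zip(ws, u))
            for k in range(c)
        ]
    return centers, xs, ws


if __name__ == "__main__":
    data = [1, 1, 1, 2, 9, 10, 10, 10]
    centers, distinct, weights = weighted_fcm_1d(data)
    print(distinct, weights, centers)
```

With eight input values but only four distinct ones, each iteration touches four weighted entries instead of eight, which is the kind of memory and time saving the abstract attributes to the weight factor.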


Indexing moving objects: A real time approach

Computer Science and Information Systems ◽

10.2298/csis111127040l ◽

2013 ◽

Vol 10 (1) ◽

pp. 173-195 ◽

Cited By ~ 1

Author(s):

George Lagogiannis ◽

Nikos Lorentzos ◽

Alexander Sideridis

Keyword(s):

Real Time ◽

Moving Objects ◽

Optimal Number ◽

Primary Memory ◽

Secondary Memory ◽

Current Position ◽

Asymptotically Optimal ◽

New Approaches

Indexing moving objects usually involves a large number of updates, caused by objects reporting their current position. To keep the present and past positions of the objects in secondary memory, each update introduces an I/O, and this process sometimes creates a bottleneck. In this paper we deal with the problem of minimizing the number of I/Os in such a way that queries concerning the present and past positions of the objects can be answered efficiently. In particular, we propose two new approaches that achieve an asymptotically optimal number of I/Os for performing the necessary updates. The approaches are based on the assumption that the primary memory suffices for storing the current positions of the objects.
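The stated assumption, that current positions fit in primary memory, suggests a simple way to see why updates need not cost one I/O each. The sketch below is a generic buffering illustration and not either of the two approaches proposed in the paper. The class name `BufferedPositionLog`, the block size, and the in-memory list standing in for secondary memory are all hypothetical.

```python
class BufferedPositionLog:
    """Keeps current positions in memory and writes past positions in blocks."""

    def __init__(self, block_size=4):
        self.block_size = block_size
        self.current = {}      # object id -> (timestamp, x, y), primary memory
        self.buffer = []       # superseded positions waiting to be written
        self.disk_blocks = []  # stand-in for secondary memory
        self.io_count = 0      # number of simulated block writes

    def report(self, obj_id, t, x, y):
        # A new report turns the previous current position into a past one.
        old = self.current.get(obj_id)
        if old is not None:
            self.buffer.append((obj_id,) + old)
            if len(self.buffer) == self.block_size:
                self.flush()
        self.current[obj_id] = (t, x, y)

    def flush(self):
        # One simulated I/O writes a whole block of past positions.
        if self.buffer:
            self.disk_blocks.append(list(self.buffer))
            self.buffer.clear()
            self.io_count += 1


if __name__ == "__main__":
    log = BufferedPositionLog(block_size=4)
    for t in range(9):
        log.report("a", t, float(t), 0.0)
    print(log.io_count, log.current["a"])
```

With a block holding B past positions, B updates share one write, so the amortized cost per update is 1/B I/Os in this simplified model. Answering queries about past positions efficiently would need an index over the written blocks, which this sketch leaves out.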


Design and Analysis of Spatial Skyline Queries Indexing

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.d1006.0394s220 ◽

2020 ◽

Vol 9 (4S2) ◽

pp. 23-29

Keyword(s):

Current Practice ◽

Dimensional Space ◽

Information Age ◽

Secondary Memory ◽

Skyline Query ◽

Skyline Queries ◽

Keyword Query ◽

Spatial Keyword Query ◽

Nearest Points ◽

The Ideal

Living in the information age lets nearly everybody retrieve vast amounts of information and choose among many options to meet their needs. In many cases, the quantity of available information and the speed at which it changes may obscure the ideal, required answer. Spatial-textual queries return the most relevant nearest points with respect to a given location and a keyword set. Current practice has mostly focused on how to efficiently obtain the top-k result set returned for a spatial-textual query. An efficient Spatial Range Skyline Query (SRSQ) algorithm is proposed, which first performs a spatial keyword query (SKQ) that relies on an IR-tree indexing the data. Skyline points are selected not only on the basis of their distances to a set of query points but also on their relevance to a group of query keywords. In addition, range skyline (RS) methods based on an R-tree over multi-dimensional space are proposed, including secondary-memory pruning tools for processing range skyline queries. The proposed scheme is dynamic and I/O optimal. Finally, an experimental evaluation is presented that demonstrates its efficiency.
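For readers unfamiliar with skyline queries, the underlying notion of dominance can be sketched in a few lines. This is a generic in-memory illustration, assuming smaller values are better in every dimension. It is not the SRSQ algorithm and uses no IR-tree, R-tree, or keyword relevance. The function names `dominates` and `skyline` are hypothetical.

```python
def dominates(p, q):
    # p dominates q if p is no worse in every dimension and strictly
    # better in at least one (smaller is better, by assumption).
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def skyline(points):
    # Simple window scan: keep only the points that no other point dominates.
    window = []
    for p in points:
        if any(dominates(w, p) for w in window):
            continue
        window = [w for w in window if not dominates(p, w)]
        window.append(p)
    return window


if __name__ == "__main__":
    pts = [(1, 5), (3, 3), (2, 2), (5, 1), (4, 4)]
    print(skyline(pts))
```

The scan compares every point against the current window, so it can be quadratic in the worst case. Index-based methods aim to prune whole subtrees that cannot contribute skyline points, which is what reduces I/O when the data resides in secondary memory.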


Hippocampal formation size in normal human aging: a correlate of delayed secondary memory performance.

Learning & Memory ◽

10.1101/lm.1.1.45 ◽

1994 ◽

Vol 1 (1) ◽

pp. 45-54

Author(s):

J Golomb ◽

A Kluger ◽

M J de Leon ◽

S H Ferris ◽

A Convit ◽

...

Keyword(s):

Hippocampal Formation ◽

Control Measure ◽

Digit Span ◽

Memory Performance ◽

Memory Loss ◽

Cerebral Atrophy ◽

Secondary Memory ◽

Elderly Persons ◽

Human Aging ◽

Normal Human

Although mild progressive memory impairment is commonly associated with normal human aging, it is unclear whether this phenomenon can be explained by specific structural brain changes. In a research sample of 54 medically healthy and cognitively normal elderly persons (ages 55-87, mean = 69.0 ± 7.9), magnetic resonance imaging (MRI) was used to derive head-size-adjusted measurements of the hippocampal formation (HF) (dentate gyrus, hippocampus proper, alveus, fimbria, subiculum), the superior temporal gyrus (STG), and the subarachnoid cerebrospinal fluid (CSF) (to estimate generalized cerebral atrophy). Subjects were administered tests of primary memory (digit span) and tests of secondary memory with immediate and delayed recall components (paragraph, paired associate, list recall; facial recognition). Separate composite scores for the immediate and delayed components were created by combining, with equal weighting, the subtests of each category. The WAIS vocabulary subtest was used as a control measure for language and intelligence. A highly significant correlation (P < 0.001), independent of age, gender, and generalized cerebral atrophy, was found between HF size and delayed memory performance. No significant correlations were found between HF size and primary or immediate memory performance. STG size was not significantly correlated with any of the composite memory variables. These results suggest that HF atrophy may play an important independent role in contributing to the memory loss experienced by many aging adults.
