Content-Based Image Retrial Based on Hadoop

Solving Large-Scale TSP Using a Fast Wedging Insertion Partitioning Approach

Mathematical Problems in Engineering ◽

10.1155/2015/854218 ◽

2015 ◽

Vol 2015 ◽

pp. 1-8 ◽

Cited By ~ 6

Author(s):

Zuoyong Xiang ◽

Zhenyu Chen ◽

Xingyu Gao ◽

Xinjun Wang ◽

Fangchun Di ◽

...

Keyword(s):

Traveling Salesman Problem ◽

Time Complexity ◽

Large Scale ◽

Construction Method ◽

Traveling Salesman ◽

Experimental Results ◽

Insertion Method ◽

Symmetric Traveling Salesman Problem ◽

Traditional Construction

A new partitioning method, called Wedging Insertion, is proposed for solving the large-scale symmetric Traveling Salesman Problem (TSP). The idea of the proposed algorithm is to cut a TSP tour into four segments by the nodes’ coordinates (not by rectangles, as in Strip, FRP, and Karp). Each node, apart from four particular nodes, is located in one of these segments, and no segment twists with another. After the partitioning process, the algorithm applies a traditional construction method, namely the insertion method, to each segment to improve the quality of the tour, and then connects the starting node and the ending node of each segment to obtain the complete tour. To test the performance of the proposed algorithm, we conduct experiments on various TSPLIB instances. The experimental results show that the proposed algorithm is more efficient for solving large-scale TSPs. Specifically, our approach markedly reduces the time complexity of the algorithm while losing only about 10% of its performance.
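The insertion step mentioned in the abstract can be illustrated with a generic cheapest-insertion heuristic. This is a minimal sketch only: the function names `tour_length` and `cheapest_insertion` are hypothetical, the tour is treated as closed, and the coordinate-based partitioning into four segments described by the authors is not reproduced here.

```python
import math

def tour_length(points, tour):
    """Total length of the closed tour that visits points in the given order."""
    return sum(
        math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )

def cheapest_insertion(points):
    """Grow a closed tour by always inserting the node that adds the least length."""
    n = len(points)
    if n <= 3:
        return list(range(n))
    tour = [0, 1, 2]               # arbitrary starting triangle (an assumption)
    remaining = set(range(3, n))
    while remaining:
        best = None                # (added_length, node, insert_position)
        for node in remaining:
            for pos in range(len(tour)):
                a, b = tour[pos], tour[(pos + 1) % len(tour)]
                added = (math.dist(points[a], points[node])
                         + math.dist(points[node], points[b])
                         - math.dist(points[a], points[b]))
                if best is None or added < best[0]:
                    best = (added, node, pos + 1)
        _, node, pos = best
        tour.insert(pos, node)
        remaining.remove(node)
    return tour
```

Each insertion scans every remaining node against every tour edge, so the cost grows quickly with the number of nodes. Running the heuristic on smaller segments, as the abstract describes, is one way to keep that cost down.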

Multi-path Coverage of all Final States for Model-Based Testing Theory using Spark In-memory Design

10.36227/techrxiv.13283477 ◽

2020 ◽

Author(s):

Wilfried Yves Hamilton Adoni ◽

Moez Krichen ◽

Tarik Nahhal ◽

Abdeltif Elbyed

Keyword(s):

Time Complexity ◽

Large Scale ◽

Computation Time ◽

Final States ◽

Memory Design ◽

Model Based ◽

Distributed Approach ◽

Distributed Framework ◽

Model Based Testing ◽

Testing Theory

This paper deals with an efficient and robust distributed framework for finite state machine coverage in the field of model-based testing theory. Covering all final states in a large-scale automaton is inherently compute-intensive and memory-exhausting, with impractical time complexity, because of the explosion in the number of states. Thus, it is important to propose a faster solution that reduces the time complexity by exploiting big data concepts based on Spark RDD computation. To cope with this situation, we propose a parallel and distributed approach based on Spark's in-memory design which exploits the A* algorithm for optimal coverage. Experiments performed on a multi-node cluster show that the proposed framework achieves a significant gain in computation time.
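As a rough illustration of the search component, the sketch below runs A* on a small automaton held in memory on a single machine. The transition format, the function name `a_star`, and the default zero heuristic (which reduces A* to uniform-cost search) are all assumptions made for this example. The Spark RDD distribution described in the abstract is not shown.

```python
import heapq

def a_star(transitions, start, finals, heuristic=lambda state: 0):
    """Return (cost, path) to the cheapest reachable final state, or None.

    transitions: dict mapping a state to a list of (next_state, cost) pairs.
    heuristic: estimate of the remaining cost; it must never overestimate.
    """
    frontier = [(heuristic(start), 0, start, [start])]
    best_cost = {start: 0}
    while frontier:
        _, cost, state, path = heapq.heappop(frontier)
        if state in finals:
            return cost, path
        if cost > best_cost.get(state, float("inf")):
            continue               # stale queue entry, a cheaper route was found
        for nxt, step in transitions.get(state, []):
            new_cost = cost + step
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                heapq.heappush(
                    frontier,
                    (new_cost + heuristic(nxt), new_cost, nxt, path + [nxt]),
                )
    return None
```

Covering every final state could then be approximated by calling `a_star` once per final state with `finals={target}`. How the authors actually schedule these searches across a cluster is not stated in the abstract.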

A distributed near-optimal LSH-based framework for privacy-preserving record linkage

Computer Science and Information Systems ◽

10.2298/csis140215040k ◽

2014 ◽

Vol 11 (2) ◽

pp. 745-763 ◽

Cited By ~ 10

Author(s):

Dimitrios Karapiperis ◽

Vassilios Verykios

Keyword(s):

Record Linkage ◽

Privacy Preserving ◽

Experimental Results ◽

Locality Sensitive Hashing ◽

Map Reduce ◽

Data Sets ◽

Commodity Hardware ◽

Distributed Framework

In this paper, we present a framework which relies on the Map/Reduce paradigm to distribute computations uniformly among underutilized commodity hardware resources, without imposing extra overhead on the existing infrastructure. The volume of distance computations required for comparing records is largely reduced by utilizing the Locality-Sensitive Hashing technique, which is optimally tuned in order to avoid highly redundant computations. Experimental results illustrate the effectiveness of our distributed framework in finding the matched record pairs in voluminous data sets.
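The way Locality-Sensitive Hashing cuts down the number of record comparisons can be sketched with bit-sampling LSH over binary vectors. The abstract does not say how records are encoded or which hash family is used, so the bit-vector encoding, the function names, and the parameters below are assumptions for illustration, and the Map/Reduce distribution is omitted.

```python
import random
from collections import defaultdict
from itertools import combinations

def build_hash_functions(num_bits, num_tables, bits_per_key, seed=0):
    """Each hash function samples a fixed set of bit positions."""
    rng = random.Random(seed)
    return [rng.sample(range(num_bits), bits_per_key) for _ in range(num_tables)]

def candidate_pairs(records, hash_functions):
    """Return record-id pairs that share a bucket in at least one hash table.

    records: dict mapping a record id to a tuple of 0/1 bits.
    """
    pairs = set()
    for positions in hash_functions:
        buckets = defaultdict(list)
        for rec_id, bits in records.items():
            key = tuple(bits[p] for p in positions)   # blocking key
            buckets[key].append(rec_id)
        for ids in buckets.values():
            pairs.update(combinations(sorted(ids), 2))
    return pairs
```

Only the pairs returned by `candidate_pairs` would go on to the full distance computation. Records that are close in Hamming distance are likely to collide in some table, while distant records rarely do. The number of tables and the key length trade missed matches against redundant comparisons.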

Computational fluid dynamics of rectangular external loop airlift reactor

International Journal of Chemical Reactor Engineering ◽

10.1515/ijcre-2020-0009 ◽

2020 ◽

Vol 18 (5-6) ◽

Author(s):

Shivanand M. Teli ◽

Channamallikarjun S. Mathpati

Keyword(s):

Large Scale ◽

Lift Coefficient ◽

Volume Ratio ◽

Reynolds Stress Model ◽

Airlift Reactor ◽

Experimental Results ◽

Turbulent Dispersion ◽

Drag Forces ◽

External Loop ◽

Turbulent Models

The novel design of a rectangular external loop airlift reactor (EL-ALR) is at present the most used large-scale reactor for microalgae culture. Its unique feature is a large surface-to-volume ratio for exposure to light radiation for the photosynthesis reaction. 3D simulations have been performed in the rectangular EL-ALR. The Eulerian–Eulerian approach has been used with a dispersed gas phase for different turbulence models. The performance and applicability of different turbulence models, i.e., K-epsilon standard, K-epsilon realizable, K-omega, and the Reynolds stress model, are compared with experimental results. All drag forces and non-drag forces (turbulent dispersion, virtual mass, and lift coefficient) are included in the model. The experimental values of overall gas hold-up and average liquid circulation velocity have been compared with simulation and literature results, and they show good agreement. For the different elevations in the downcomer section, experimental liquid axial velocity, turbulent kinetic energy, and turbulent eddy dissipation have been compared with the different turbulence models. The K-epsilon realizable model gives better predictions when compared with experimental results.

Evaluation of recent advances in recommender systems on Arabic content

Journal Of Big Data ◽

10.1186/s40537-021-00420-2 ◽

2021 ◽

Vol 8 (1) ◽

Author(s):

Mehdi Srifi ◽

Ahmed Oussous ◽

Ayoub Ait Lahcen ◽

Salma Mouline

Keyword(s):

Recommender Systems ◽

High Performance ◽

Large Scale ◽

State Of The Art ◽

Experimental Results ◽

Recent Advances ◽

Research Gap ◽

Text Preprocessing

Various recommender systems (RSs) have been developed over recent years, and many of them have concentrated on English content. Thus, the majority of RSs from the literature were compared on English content. However, research on RSs using content in other languages, such as Arabic, is minimal. Researchers still neglect the field of Arabic RSs. Therefore, we aim through this study to fill this research gap by leveraging recent advances in the English RSs field. Our main goal is to investigate recent RSs in an Arabic context. To that end, we first selected five state-of-the-art RSs originally devoted to English content, and then we empirically evaluated their performance on Arabic content. As a result of this work, we first built four publicly available large-scale Arabic datasets for recommendation purposes. Second, various text preprocessing techniques have been provided for preparing the constructed datasets. Third, our investigation derived well-argued conclusions about the usage of modern RSs in the Arabic context. The experimental results showed that these systems ensure high performance when applied to Arabic content.

Video copy detection based on Speeded Up Robust Features and Locality Sensitive Hashing

2010 IEEE International Conference on Automation and Logistics ◽

10.1109/ical.2010.5585375 ◽

2010 ◽

Cited By ~ 8

Author(s):

Zhijie Zhang ◽

Chongxiao Cao ◽

Ruijie Zhang ◽

Jianhua Zou

Keyword(s):

Locality Sensitive Hashing ◽

Video Copy Detection ◽

Copy Detection ◽

Speeded Up Robust Features

Local Graph Edge Partitioning

ACM Transactions on Intelligent Systems and Technology ◽

10.1145/3466685 ◽

2021 ◽

Vol 12 (5) ◽

pp. 1-25

Author(s):

Shengwei Ji ◽

Chenyang Bu ◽

Lei Li ◽

Xindong Wu

Keyword(s):

Real World ◽

Graph Partitioning ◽

Large Scale ◽

Complete Information ◽

Local Information ◽

Experimental Results ◽

Two Stage ◽

Graph Computation ◽

Local Graph ◽

Edge Partitioning

Graph edge partitioning, which is essential for the efficiency of distributed graph computation systems, divides a graph into several balanced partitions within a given size to minimize the number of vertices to be cut. Existing graph partitioning models can be classified into two categories: offline and streaming graph partitioning models. The former requires global graph information during the partitioning, which is expensive in terms of time and memory for large-scale graphs. The latter creates partitions based solely on the received graph information. However, the streaming model may result in a lower partitioning quality compared with the offline model. Therefore, this study introduces a Local Graph Edge Partitioning model, which considers only the local information (i.e., a portion of a graph instead of the entire graph) during the partitioning. Considering only the local graph information is meaningful because acquiring complete information for large-scale graphs is expensive. Based on the Local Graph Edge Partitioning model, two local graph edge partitioning algorithms—Two-stage Local Partitioning and Adaptive Local Partitioning—are given. Experimental results obtained on 14 real-world graphs demonstrate that the proposed algorithms outperform rival algorithms in most tested cases. Furthermore, the proposed algorithms are proven to significantly improve the efficiency of the real graph computation system GraphX.
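The objective of minimizing cut vertices can be made concrete by measuring how many partitions each vertex ends up in. The sketch below computes the replication factor, a commonly used quality measure for edge partitioning, together with the set of cut vertices. The input format and function names are assumptions for this example, and neither of the two local partitioning algorithms named in the abstract is implemented.

```python
from collections import defaultdict

def vertex_partitions(edge_to_part):
    """Map each vertex to the set of partitions that hold one of its edges.

    edge_to_part: dict mapping an edge (u, v) to its partition id.
    """
    parts_of = defaultdict(set)
    for (u, v), part in edge_to_part.items():
        parts_of[u].add(part)
        parts_of[v].add(part)
    return parts_of

def replication_factor(edge_to_part):
    """Average number of partitions in which a vertex appears (1.0 is ideal)."""
    parts_of = vertex_partitions(edge_to_part)
    return sum(len(parts) for parts in parts_of.values()) / len(parts_of)

def cut_vertices(edge_to_part):
    """Vertices whose edges are spread over more than one partition."""
    return {v for v, parts in vertex_partitions(edge_to_part).items() if len(parts) > 1}
```

A lower replication factor means fewer vertex copies have to be synchronized between machines, which is why it is a natural proxy for the communication cost of a distributed graph computation.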

An improved OPTICS clustering algorithm for discovering clusters with uneven densities

Intelligent Data Analysis ◽

10.3233/ida-205497 ◽

2021 ◽

Vol 25 (6) ◽

pp. 1453-1471

Author(s):

Chunhua Tang ◽

Han Wang ◽

Zhiwen Wang ◽

Xiangkun Zeng ◽

Huaran Yan ◽

...

Keyword(s):

Time Complexity ◽

Clustering Algorithm ◽

Nearest Neighbor ◽

Clustering Algorithms ◽

Substantial Improvement ◽

Experimental Results ◽

High Time ◽

Parameter Setting ◽

K Nearest Neighbor ◽

Density Based Clustering

Most density-based clustering algorithms suffer from difficult parameter setting, high time complexity, poor noise recognition, and weak clustering of datasets with uneven density. To solve these problems, this paper proposes the FOP-OPTICS algorithm (Finding of the Ordering Peaks Based on OPTICS), which is a substantial improvement of OPTICS (Ordering Points To Identify the Clustering Structure). The proposed algorithm finds the demarcation point (DP) from the Augmented Cluster-Ordering generated by OPTICS and uses the reachability-distance of the DP as the neighborhood radius eps of its corresponding cluster. It overcomes the weakness of most algorithms in clustering datasets with uneven densities. By computing the distance of the k-nearest neighbor of each point, it reduces the time complexity of OPTICS; by calculating density-mutation points within the clusters, it can efficiently recognize noise. The experimental results show that FOP-OPTICS has the lowest time complexity and outperforms other algorithms in parameter setting and noise recognition.
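The two quantities the abstract relies on, the k-nearest-neighbor distance and the reachability-distance, can be sketched as follows. This is a brute-force illustration under simplifying assumptions: a point is not counted as its own neighbor, no eps cutoff is applied, and the function names are hypothetical. It does not reproduce FOP-OPTICS or its demarcation-point search.

```python
import math

def k_distance(points, i, k):
    """Distance from points[i] to its k-th nearest other point."""
    dists = sorted(
        math.dist(points[i], q) for j, q in enumerate(points) if j != i
    )
    return dists[k - 1]

def reachability_distance(points, o, p, k):
    """Reachability of points[p] from points[o].

    It is the plain distance between the two points, but never smaller
    than the k-distance of points[o].
    """
    return max(k_distance(points, o, k), math.dist(points[o], points[p]))
```

Computed this way, every k-distance costs a full scan of the dataset. A practical implementation would normally use a spatial index for the neighbor queries.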

Accelerating large scale centroid-based clustering with locality sensitive hashing

2016 IEEE 32nd International Conference on Data Engineering (ICDE) ◽

10.1109/icde.2016.7498278 ◽

2016 ◽

Cited By ~ 1

Author(s):

Ryan McConville ◽

Xin Cao ◽

Weiru Liu ◽

Paul Miller

Keyword(s):

Large Scale ◽

Locality Sensitive Hashing
