SparkSW: Scalable Distributed Computing System for Large-Scale Biological Sequence Alignment



POSTER: BioSEAL: In-Memory Biological Sequence Alignment Accelerator for Large-Scale Genomic Data

2019 28th International Conference on Parallel Architectures and Compilation Techniques (PACT) ◽

10.1109/pact.2019.00044 ◽

2019 ◽

Cited By ~ 4

Author(s):

Roman Kaplan ◽

Leonid Yavits ◽

Ran Ginosar

Keyword(s):

Sequence Alignment ◽

Large Scale ◽

Genomic Data ◽

Biological Sequence


A Lightweight Distributed Computing System Based on JavaScript Technology

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.34-35.1911 ◽

2010 ◽

Vol 34-35 ◽

pp. 1911-1915

Author(s):

Jun Tang

Keyword(s):

Distributed Computing ◽

Information Exchange ◽

Large Scale ◽

Computing System ◽

The Internet ◽

Computational Problem ◽

Intermediary Role ◽

Computational Platform ◽

IT Expert ◽

The Web

Because the web is not only a platform for information exchange but also a computational platform built on the JavaScript engine, any Internet-connected computer with a modern browser can access the web and execute JavaScript programs. Building on this, we develop a lightweight distributed computing system based on web and JavaScript technologies. Our system plays an intermediary role between the IT expert who has to solve a large-scale computational problem and end users on the Internet. In other words, people can easily cooperate to solve a complicated computational problem with the support of our system.
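The intermediary role described in this abstract can be pictured with a minimal sketch. It assumes a coordinator that splits a problem into work units, hands them to volunteers, and merges the partial results. The names `Coordinator`, `next_unit`, `submit`, and `volunteer` are hypothetical and not taken from the paper, and Python stands in here for the browser-side JavaScript the paper describes.

```python
from collections import deque


class Coordinator:
    """Hypothetical intermediary: queues work units and collects results."""

    def __init__(self, data, chunk_size):
        # Split the input into fixed-size work units, each tagged with an id.
        self.pending = deque(
            (i, data[start:start + chunk_size])
            for i, start in enumerate(range(0, len(data), chunk_size))
        )
        self.results = {}

    def next_unit(self):
        """Hand out the next work unit, or None when nothing is left."""
        return self.pending.popleft() if self.pending else None

    def submit(self, unit_id, value):
        """Record the partial result returned by a volunteer."""
        self.results[unit_id] = value

    def combine(self):
        """Merge partial results (here: a plain sum)."""
        return sum(self.results.values())


def volunteer(coordinator, work):
    """Stand-in for a browser client: pull units until none remain."""
    while (unit := coordinator.next_unit()) is not None:
        unit_id, chunk = unit
        coordinator.submit(unit_id, work(chunk))


def sum_of_squares(chunk):
    """Example workload applied to one work unit."""
    return sum(x * x for x in chunk)


def run_demo():
    coordinator = Coordinator(list(range(1, 101)), chunk_size=10)
    # A single volunteer drains the queue here; several clients could
    # call next_unit() and submit() independently in the same way.
    volunteer(coordinator, sum_of_squares)
    return coordinator.combine()
```

Calling `run_demo()` sums the squares of 1 through 100 in ten work units. Keeping the coordinator unaware of the workload (it only stores units and results) is one way to let the expert swap in a different computation without touching the distribution logic.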


Privacy Enhanced Data Security Mechanism in a Large-Scale Distributed Computing System for HTC and MTC

International Journal of Contents ◽

10.5392/ijoc.2016.12.2.006 ◽

2016 ◽

Vol 12 (2) ◽

pp. 6-11 ◽

Cited By ~ 1

Author(s):

Seungwoo Rho ◽

Sangbae Park ◽

Soonwook Hwang

Keyword(s):

Distributed Computing ◽

Data Security ◽

Large Scale ◽

Computing System ◽

Security Mechanism


Deploying Distributed Computing

Cases on Information Technology Series - Annals of Cases on Information Technology ◽

10.4018/978-1-61520-593-6.ch002 ◽

2000 ◽

pp. 24-38

Author(s):

Steve Sawyer ◽

William Gibbons

Keyword(s):

Distributed Computing ◽

Large Scale ◽

Computing System ◽

Large Organization ◽

Client Server ◽

Teaching Case ◽

Technology Governance ◽

Technical Architecture ◽

Software Implementations ◽

Server Architecture

This teaching case describes the efforts of one department in a large organization to migrate from an internally developed, mainframe-based computing system to a system based on purchased software running on a client/server architecture. The case highlights issues with large-scale software implementations such as those demanded by enterprise resource package (ERP) installations. Often, the ERP selected by an organization does not have all the required functionality. This demands purchasing and installing additional packages (known colloquially as bolt-ons) to provide the needed functionality. These implementations lead to issues regarding oversight of the technical architecture, both project and technology governance, and user department capability for managing the installation of new systems.


ZRPC: a distributed computing system for large scale 3D reconstruction

10.14711/thesis-b1628170 ◽

2016 ◽

Author(s):

Zuozhuo Dai

Keyword(s):

Distributed Computing ◽

3D Reconstruction ◽

Large Scale ◽

Computing System


Building a Distributed Computing System for LDMX

EPJ Web of Conferences ◽

10.1051/epjconf/202125102038 ◽

2021 ◽

Vol 251 ◽

pp. 02038

Author(s):

Lene Kristian Bryngemark ◽

David Cameron ◽

Valentina Dutta ◽

Thomas Eichlersmith ◽

Balazs Konya ◽

...

Keyword(s):

Dark Matter ◽

Distributed Computing ◽

Particle Physics ◽

Large Scale ◽

Research Collaboration ◽

Pilot Project ◽

Computing System ◽

Mass Region ◽

Small Scale ◽

Component Integration

Particle physics experiments rely extensively on computing and data services, making e-infrastructure an integral part of the research collaboration. Constructing and operating distributed computing can, however, be challenging for a smaller-scale collaboration. The Light Dark Matter eXperiment (LDMX) is a planned small-scale accelerator-based experiment to search for dark matter in the sub-GeV mass region. Finalizing the design of the detector relies on Monte Carlo simulation of expected physics processes. A distributed computing pilot project was proposed to better utilize available resources at the collaborating institutes, and to improve scalability and reproducibility. This paper outlines the chosen lightweight distributed solution, presenting requirements, the component integration steps, and the experiences using a pilot system for tests with large-scale simulations. The system leverages existing technologies wherever possible, minimizing the need for software development, and deploys only non-intrusive components at the participating sites. The pilot proved that integrating existing components can dramatically reduce the effort needed to build and operate a distributed e-infrastructure, making it attainable even for smaller research collaborations.


Parallel algorithms for large-scale biological sequence alignment on Xeon-Phi based clusters

BMC Bioinformatics ◽

10.1186/s12859-016-1128-0 ◽

2016 ◽

Vol 17 (S9) ◽

Cited By ~ 4

Author(s):

Haidong Lan ◽

Yuandong Chan ◽

Kai Xu ◽

Bertil Schmidt ◽

Shaoliang Peng ◽

...

Keyword(s):

Parallel Algorithms ◽

Sequence Alignment ◽

Large Scale ◽

Xeon Phi ◽

Biological Sequence


Parallel Nonnegative Matrix Factorization with Manifold Regularization

Journal of Electrical and Computer Engineering ◽

10.1155/2018/6270816 ◽

2018 ◽

Vol 2018 ◽

pp. 1-10

Author(s):

Fudong Liu ◽

Zheng Shan ◽

Yihang Chen

Keyword(s):

Distributed Computing ◽

Matrix Factorization ◽

Large Scale ◽

Nonnegative Matrix Factorization ◽

Nonnegative Matrix ◽

Computing System ◽

Construction Method ◽

Manifold Regularization ◽

Single Node ◽

Text Corpora

Nonnegative matrix factorization (NMF) decomposes a high-dimensional nonnegative matrix into the product of two lower-dimensional nonnegative matrices. However, conventional NMF neither scales to large datasets, because it keeps all data in memory, nor preserves the geometrical structure of the data, which some practical tasks require. In this paper, we propose a parallel NMF with manifold regularization method (PNMF-M) to overcome these deficiencies by parallelizing manifold-regularized NMF on a distributed computing system. In particular, PNMF-M distributes both data samples and factor matrices across multiple computing nodes instead of loading the whole dataset on a single node, and updates both factor matrices locally on each node. In this way, PNMF-M relieves the memory pressure of large-scale datasets and speeds up the computation through parallelization. To construct the adjacency matrix for manifold regularization, we propose a two-step distributed graph construction method, which is proved to be equivalent to the batch construction method. Experimental results on popular text corpora and image datasets demonstrate that PNMF-M significantly improves both the scalability and the time efficiency of conventional NMF thanks to parallelization on a distributed computing system; meanwhile, it significantly enhances the representation ability of conventional NMF thanks to the incorporated manifold regularization.
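The idea of distributing data samples and updating factors per node can be sketched as follows. This is not PNMF-M itself: it is plain NMF with the standard multiplicative updates, and it omits the manifold regularization term and the distributed graph construction entirely. The notation X ≈ WH, the row-wise partitioning, and all function names are assumptions made for illustration. Each simulated node holds a block of rows of X and the matching rows of W; only small k-by-m and k-by-k summaries are summed across nodes.

```python
import random

EPS = 1e-9  # guards against division by zero in the updates


def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(target, num, den):
    """Multiplicative update: target * num / den, element-wise."""
    return [[t * n / (d + EPS) for t, n, d in zip(rt, rn, rd)]
            for rt, rn, rd in zip(target, num, den)]


def nmf_step(x_blocks, w_blocks, h):
    # Each simulated node computes summaries from its own rows only.
    local = [(matmul(transpose(w), x), matmul(transpose(w), w))
             for x, w in zip(x_blocks, w_blocks)]
    # Reduce step: sum the k-by-m and k-by-k summaries across nodes.
    wtx, wtw = local[0]
    for a, b in local[1:]:
        wtx, wtw = add(wtx, a), add(wtw, b)
    h = scale(h, wtx, matmul(wtw, h))
    # With the shared H, each node updates its own block of W locally.
    ht = transpose(h)
    hht = matmul(h, ht)
    w_blocks = [scale(w, matmul(x, ht), matmul(w, hht))
                for x, w in zip(x_blocks, w_blocks)]
    return w_blocks, h


def error(x_blocks, w_blocks, h):
    """Squared Frobenius reconstruction error, summed over all blocks."""
    total = 0.0
    for x, w in zip(x_blocks, w_blocks):
        approx = matmul(w, h)
        total += sum((a - b) ** 2
                     for ra, rb in zip(x, approx) for a, b in zip(ra, rb))
    return total


def run_demo(iterations=50, seed=0):
    rng = random.Random(seed)
    n, m, k, nodes = 8, 6, 2, 2  # toy sizes, chosen arbitrarily
    x = [[rng.random() for _ in range(m)] for _ in range(n)]
    w = [[rng.random() for _ in range(k)] for _ in range(n)]
    h = [[rng.random() for _ in range(m)] for _ in range(k)]
    size = n // nodes
    x_blocks = [x[i:i + size] for i in range(0, n, size)]
    w_blocks = [w[i:i + size] for i in range(0, n, size)]
    before = error(x_blocks, w_blocks, h)
    for _ in range(iterations):
        w_blocks, h = nmf_step(x_blocks, w_blocks, h)
    return before, error(x_blocks, w_blocks, h)
```

`run_demo()` returns the reconstruction error before and after the updates. In this sketch no node ever needs the full X or the full W, which is the memory argument the abstract makes; how PNMF-M handles the regularization term across nodes is not shown here.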


Aligning Multiple Sequences Using an Improved Tabu Search Algorithm

Journal of Circuits, Systems and Computers ◽

10.1142/s0218126617500669 ◽

2016 ◽

Vol 26 (04) ◽

pp. 1750066 ◽

Cited By ~ 1

Author(s):

Lamiche Chaabane ◽

Moussaoui Abdelouahab

Keyword(s):

Tabu Search ◽

Sequence Alignment ◽

DNA Sequences ◽

Multiple Sequence Alignment ◽

Large Scale ◽

Search Algorithm ◽

Protein Structures ◽

Biological Sequence ◽

Multiple Sequence ◽

Alignment Problem

Multiple sequence alignment (MSA) is one of the most essential operations in biological sequence analysis; it is used to construct evolutionary trees for DNA sequences and to analyze protein structures to help design new proteins. This study proposes a new method for solving the sequence alignment problem, named improved tabu search (ITS). The algorithm is based on the classical tabu search (TS) optimization technique and is implemented to produce multiple sequence alignments. Several variants of the neighborhood generation and intensification/diversification strategies for ITS are investigated. Simulation results on a large collection of datasets show the efficacy of the approach and its capacity to achieve good-quality solutions, in terms of scores, compared to those given by other existing methods.
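For readers unfamiliar with the classical technique that ITS builds on, here is a minimal sketch of tabu search applied to a toy alignment. It is not the ITS algorithm from the paper. The sum-of-pairs scoring values, the gap-shifting neighborhood, the tabu tenure, and every function name are assumptions chosen for illustration.

```python
from collections import deque
from itertools import combinations


def pair_score(a, b):
    """Assumed scoring: match +1, mismatch -1, residue/gap -2, gap/gap 0."""
    if a == "-" and b == "-":
        return 0
    if a == "-" or b == "-":
        return -2
    return 1 if a == b else -1


def sp_score(alignment):
    """Sum-of-pairs score over all columns and all pairs of rows."""
    return sum(pair_score(a, b)
               for column in zip(*alignment)
               for a, b in combinations(column, 2))


def neighbors(alignment):
    """Yield alignments obtained by swapping a gap with an adjacent residue."""
    for r, row in enumerate(alignment):
        for i in range(len(row) - 1):
            if (row[i] == "-") != (row[i + 1] == "-"):
                swapped = row[:i] + row[i + 1] + row[i] + row[i + 2:]
                yield alignment[:r] + (swapped,) + alignment[r + 1:]


def tabu_search(start, iterations=50, tenure=10):
    current = best = tuple(start)
    # The tabu list remembers recently visited alignments so the search
    # cannot immediately step back to them.
    tabu = deque([current], maxlen=tenure)
    for _ in range(iterations):
        candidates = [n for n in neighbors(current) if n not in tabu]
        if not candidates:
            break
        # Move to the best non-tabu neighbor, even if it is worse than
        # the current alignment; this is what lets the search leave
        # local optima.
        current = max(candidates, key=sp_score)
        tabu.append(current)
        if sp_score(current) > sp_score(best):
            best = current
    return best
```

For example, `tabu_search(("ACGT-", "-ACGT"))` starts from an alignment in which the two identical sequences are offset by one column and shifts gaps until the residues line up. The intensification and diversification strategies the abstract mentions would sit on top of a loop like this one; they are not modeled here.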
