Parallel computing for genome sequence processing

Briefings in Bioinformatics ◽

10.1093/bib/bbab070 ◽

2021 ◽

Author(s):

You Zou ◽

Yuejie Zhu ◽

Yaohang Li ◽

Fang-Xiang Wu ◽

Jianxin Wang

Keyword(s):

Parallel Computing ◽

Data Storage ◽

Genome Sequence ◽

High Performance ◽

Programming Model ◽

Algorithm Design ◽

Sequence Processing ◽

Genome Data ◽

Sequencing Technologies ◽

Single Nucleotide Polymorphism Calling

Abstract The rapid increase of genome data brought by gene sequencing technologies poses a massive challenge to data processing. To solve the problems caused by enormous data and complex computing requirements, researchers have proposed many methods and tools, which can be divided into three types: big data storage, efficient algorithm design and parallel computing. The purpose of this review is to investigate popular parallel programming technologies for genome sequence processing. Three common parallel computing models are introduced according to their hardware architectures; each is classified into two or three types and analyzed in terms of its features. Parallel computing for genome sequence processing is then discussed through four common applications: genome sequence alignment, single nucleotide polymorphism calling, genome sequence preprocessing, and pattern detection and searching. For each application, its background is first introduced, and then a list of tools or algorithms is summarized in terms of principle, hardware platform and computing efficiency. The programming model of each hardware platform and application provides a reference for researchers choosing high-performance computing tools. Finally, we discuss the limitations and future trends of parallel computing technologies.

Human mitochondrial genome compression using machine learning techniques

Human Genomics ◽

10.1186/s40246-019-0225-3 ◽

2019 ◽

Vol 13 (S1) ◽

Cited By ~ 2

Author(s):

Rongjie Wang ◽

Tianyi Zang ◽

Yadong Wang

Keyword(s):

Machine Learning ◽

Mitochondrial Genome ◽

Data Storage ◽

Machine Learning Techniques ◽

Compression Method ◽

Genome Data ◽

Sequencing Technologies ◽

Learning Techniques ◽

Single Genome ◽

Human Mitochondrial Genome

Abstract Background In recent years, with the development of high-throughput genome sequencing technologies, a large amount of genome data has been generated, which has caused widespread concern about data storage and transmission costs. However, how to effectively compress genome sequence data remains an unsolved problem. Results In this paper, we propose a compression method using machine learning techniques (DeepDNA) for compressing human mitochondrial genome data. The experimental results show the effectiveness of our proposed method compared with other methods on human mitochondrial genome data. Conclusions The compression method we propose can be classified as a non-reference-based method, but its compression effect is comparable to that of reference-based methods. Moreover, our method achieves good compression results not only on population genomes with large redundancy, but also on single genomes with small redundancy. The code of DeepDNA is available at https://github.com/rongjiewang/DeepDNA.

Study on Mechanical Equipment Fault Diagnosis System Based on Cloud Computing

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.220-223.2520 ◽

2012 ◽

Vol 220-223 ◽

pp. 2520-2523

Author(s):

Wang Shen Hao ◽

Xin Min Dong ◽

Jie Han ◽

Wen Ping Lei

Keyword(s):

Cloud Computing ◽

Parallel Computing ◽

Fault Diagnosis ◽

High Performance ◽

Programming Model ◽

Distributed Storage ◽

Mechanical Equipment ◽

Data Parallel ◽

On Line ◽

Data Parallel Computing

Generally working in severe conditions, mechanical equipment is subject to progressive deterioration of its state. Mechanical failures account for more than 60% of system breakdowns. Therefore, identifying impending mechanical faults is very important to prevent the system from running in a faulty state. Traditional parallel computing generally requires a high-performance computer, whereas a parallel FFT algorithm based on the Hadoop MapReduce programming model can be realized on low-end machines. Combining cloud computing with equipment fault diagnosis technology makes massive data-parallel computing and distributed storage possible. The experimental results show that this approach provides a good solution and technical support for on-line monitoring and real-time fault diagnosis of mechanical equipment.

DATA STORAGE FOR PARALLEL COMPUTING IN INDIVIDUAL SOFTWARE ENVIRONMENTS WHEN SOLVING MATERIALS SCIENCE PROBLEMS

10.29003/m2457.mmmsec-2021/11-15 ◽

2021 ◽

Author(s):

Konstantin Volovich ◽

Sergey Denisov

Keyword(s):

Parallel Computing ◽

High Performance Computing ◽

Data Storage ◽

High Performance ◽

Materials Science ◽

Storage System ◽

Parallel Computations ◽

Software Systems ◽

Data Storage System ◽

Performance Computing

The paper discusses methods of data storage when performing parallel computations in a multicomputer high-performance computing complex in virtual software environments. Approaches to building a data storage system using software systems designed to solve problems of materials science are proposed.

Matlab and Parallel Computing

Image Processing & Communications ◽

10.2478/v10248-012-0048-5 ◽

2012 ◽

Vol 17 (4) ◽

pp. 207-216 ◽

Cited By ~ 5

Author(s):

Magdalena Szymczyk ◽

Piotr Szymczyk

Keyword(s):

Image Processing ◽

Signal Processing ◽

Parallel Computing ◽

Distributed Computing ◽

Control Systems ◽

High Performance ◽

Parallel Applications ◽

Process Simulations ◽

Key Features ◽

Financial Process

Abstract MATLAB is a technical computing language used in a variety of fields, such as control systems, image and signal processing, visualization and financial process simulations, in an easy-to-use environment. MATLAB offers "toolboxes", which are specialized libraries for a variety of scientific domains, and a simplified interface to high-performance libraries (LAPACK, BLAS and FFTW). MATLAB is now enriched by the possibility of parallel computing with the Parallel Computing Toolbox™ and MATLAB Distributed Computing Server™. In this article we present some of the key features of MATLAB parallel applications, focusing on the use of GPU processors for image processing.

High-Level Parallel Ant Colony Optimization with Algorithmic Skeletons

International Journal of Parallel Programming ◽

10.1007/s10766-021-00714-1 ◽

2021 ◽

Author(s):

Breno A. de Melo Menezes ◽

Nina Herrmann ◽

Herbert Kuchen ◽

Fernando Buarque de Lima Neto

Keyword(s):

Ant Colony Optimization ◽

High Performance ◽

Optimization Problems ◽

Programming Model ◽

Parallel Implementation ◽

Ant Colony ◽

Algorithmic Skeletons ◽

Low Level ◽

Programming Patterns ◽

High Level

Abstract Parallel implementations of swarm intelligence algorithms such as the ant colony optimization (ACO) have been widely used to shorten the execution time when solving complex optimization problems. When aiming for a GPU environment, developing efficient parallel versions of such algorithms using CUDA can be a difficult and error-prone task even for experienced programmers. To overcome this issue, the parallel programming model of Algorithmic Skeletons simplifies parallel programs by abstracting from low-level features. This is realized by defining common programming patterns (e.g. map, fold and zip) that later on will be converted to efficient parallel code. In this paper, we show how algorithmic skeletons formulated in the domain specific language Musket can cope with the development of a parallel implementation of ACO and how that compares to a low-level implementation. Our experimental results show that Musket suits the development of ACO. Besides making it easier for the programmer to deal with the parallelization aspects, Musket generates high performance code with similar execution times when compared to low-level implementations.
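
The map, fold and zip patterns named in this abstract can be illustrated with a minimal Python sketch. This is not Musket syntax and not the authors' implementation: the helper names (`skel_map`, `skel_zip`, `skel_fold`, `tour_length`) and the toy tour data are hypothetical, chosen only to show how a program can be written against the patterns while the execution strategy stays hidden inside them. A thread pool stands in for the parallel backend; in CPython it shows the structure rather than delivering a real speed-up for CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
from operator import add


def skel_map(fn, xs, workers=4):
    """Map skeleton: apply fn to every element; result order is preserved."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fn, xs))


def skel_zip(fn, xs, ys):
    """Zip skeleton: combine two equally long sequences element by element."""
    if len(xs) != len(ys):
        raise ValueError("zip skeleton expects sequences of equal length")
    return [fn(x, y) for x, y in zip(xs, ys)]


def skel_fold(fn, xs, initial):
    """Fold skeleton: reduce a sequence to a single value with fn."""
    return reduce(fn, xs, initial)


def tour_length(edge_lengths):
    """Hypothetical helper: total length of one ant's tour."""
    return skel_fold(add, edge_lengths, 0.0)


# Hypothetical ACO-flavoured use: score every ant's tour, then keep the best.
tours = [[1.0, 2.0, 3.0], [2.0, 2.0, 1.0], [4.0, 0.5, 0.5]]
lengths = skel_map(tour_length, tours)
best_length = skel_fold(min, lengths, float("inf"))
```

The calling code never mentions threads or kernels; replacing the body of `skel_map` with a different backend would leave it unchanged, which is the separation of concerns the skeleton approach relies on.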

Nonvolatile Ternary Resistive Memory Performance of a Benzothiadiazole-Based Donor–Acceptor Material on ITO-Coated Glass

Coatings ◽

10.3390/coatings11030318 ◽

2021 ◽

Vol 11 (3) ◽

pp. 318

Author(s):

Yang Li ◽

Cheng Zhang ◽

Zhiming Shi ◽

Jingni Li ◽

Qingyun Qian ◽

...

Keyword(s):

Data Storage ◽

High Performance ◽

Memory Performance ◽

Resistive Memory ◽

Molecular Systems ◽

Threshold Voltages ◽

Donor Acceptor ◽

Low Threshold ◽

Multilevel Memory ◽

Further Development

The explosive growth of data and information has increasingly motivated scientific and technological endeavors toward ultra-high-density data storage (UHDDS) applications. Herein, a donor–acceptor (D–A) type small conjugated molecule containing benzothiadiazole (BT) is prepared (NIBTCN), which demonstrates multilevel resistive memory behavior and holds considerable promise for implementing the target of UHDDS. The as-prepared device presents distinct current ratios of 10^5.2/10^3.2/1, low threshold voltages of −1.90 V and −3.85 V, and satisfactory reproducibility beyond 60%, which suggests reliable device performance. This work represents a favorable step toward further development of highly-efficient D–A molecular systems, which opens more opportunities for achieving high performance multilevel memory materials and devices.

Application of High Performance Parallel Computing Based on GPU

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.411-414.585 ◽

2013 ◽

Vol 411-414 ◽

pp. 585-588

Author(s):

Liu Yang ◽

Tie Ying Liu

Keyword(s):

Particle Swarm Optimization ◽

Parallel Computing ◽

Parallel Computation ◽

High Performance ◽

Search Process ◽

Search Rate ◽

Swarm Optimization ◽

Path Search ◽

Parallel Feature ◽

Time And Space Complexity

This paper introduces the parallel features of the GPU, which allow GPU parallel computation methods to parallelize the path search process of particle swarm optimization (PSO) and to reduce PSO's increasingly high time and space complexity. The experimental results show that, compared with CPU mode, computation on the GPU platform improves the search rate and shortens the calculation time.

Genome Sequence of the Polysaccharide-Degrading, Thermophilic Anaerobe Spirochaeta thermophila DSM 6192

Journal of Bacteriology ◽

10.1128/jb.01023-10 ◽

2010 ◽

Vol 192 (24) ◽

pp. 6492-6493 ◽

Cited By ~ 13

Author(s):

Angel Angelov ◽

Susanne Liebl ◽

Meike Ballschmiter ◽

Mechthild Bömeke ◽

Rüdiger Lehmann ◽

...

Keyword(s):

Genome Sequence ◽

Complete Genome Sequence ◽

Glycoside Hydrolase ◽

Enzyme System ◽

Cellulose Degradation ◽

Carbohydrate Binding ◽

Carbohydrate Binding Module ◽

Free Living ◽

Genome Data ◽

Genes Encoding

ABSTRACT Spirochaeta thermophila is a thermophilic, free-living anaerobe that is able to degrade various α- and β-linked sugar polymers, including cellulose. We report here the complete genome sequence of S. thermophila DSM 6192, which is the first genome sequence of a thermophilic, free-living member of the Spirochaetes phylum. The genome data reveal a high density of genes encoding enzymes from more than 30 glycoside hydrolase families, a noncellulosomal enzyme system for (hemi)cellulose degradation, and indicate the presence of a novel carbohydrate-binding module.

High Performance Heterogeneous Data Storage System for High Frequency Sensor Data in a Landslide Laboratory

Advancing Culture of Living with Landslides ◽

10.1007/978-3-319-53498-5_43 ◽

2017 ◽

pp. 371-379

Author(s):

Guntha Ramesh ◽

Hariharan Balaji ◽

T. Hemalatha

Keyword(s):

Data Storage ◽

High Frequency ◽

High Performance ◽

Storage System ◽

Heterogeneous Data ◽

Sensor Data ◽

Frequency Sensor ◽

Data Storage System

Mobiliti: Scalable Transportation Simulation Using High-Performance Parallel Computing

2018 21st International Conference on Intelligent Transportation Systems (ITSC) ◽

10.1109/itsc.2018.8569397 ◽

2018 ◽

Cited By ~ 2

Author(s):

Cy Chan ◽

Bin Wang ◽

John Bachan ◽

Jane Macfarlane

Keyword(s):

Parallel Computing ◽

High Performance ◽

Transportation Simulation
