distributed computing infrastructures Latest Research Papers

Clustering error messages produced by distributed computing infrastructure during the processing of high energy physics data

International Journal of Modern Physics A ◽

10.1142/s0217751x21500706 ◽

2021 ◽

Vol 36 (10) ◽

pp. 2150070

Author(s):

Maria Grigorieva ◽

Dmitry Grin

Keyword(s):

Distributed Computing ◽

Large Scale ◽

High Energy Physics ◽

High Energy ◽

Machine Learning Algorithms ◽

Service Failures ◽

Fault Handling ◽

Scientific Experiments ◽

Computing Centers ◽

Distributed Computing Infrastructures

Large-scale distributed computing infrastructures ensure the operation and maintenance of scientific experiments at the LHC: more than 160 computing centers all over the world execute tens of millions of computing jobs per day. ATLAS, the largest experiment at the LHC, creates an enormous flow of data that has to be recorded and analyzed by a complex, heterogeneous, distributed computing environment. Statistically, about 10–12% of computing jobs end in failure: network faults, service failures, authorization failures, and other error conditions trigger error messages that provide detailed information about the issue and can be used for diagnosis and proactive fault handling. However, this analysis is complicated by the sheer scale of textual log data and is often exacerbated by the lack of a well-defined structure: human experts have to interpret the detected messages and create parsing rules manually, which is time-consuming and does not allow previously unknown error conditions to be identified without further human intervention. This paper describes a pipeline of methods for the unsupervised clustering of multi-source error messages. The pipeline is data-driven, based on machine learning algorithms, and executed fully automatically; it categorizes error messages according to textual patterns and meaning.
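
As a rough illustration of what grouping error messages "according to textual patterns" can look like, the sketch below masks the variable parts of each message (URLs, paths, numbers) and groups messages that share the resulting template. This is a deliberately simple, rule-based stand-in and not the machine-learning pipeline the paper describes; the masking rules, placeholder names, and function names are all assumptions made for this example.

```python
import re
from collections import defaultdict

# Hypothetical masking rules: each replaces a variable part of a message
# with a fixed placeholder. Order matters (URLs before paths, paths before numbers).
_MASKS = [
    (re.compile(r"https?://\S+"), "<URL>"),
    (re.compile(r"(?:/[\w.\-]+)+"), "<PATH>"),
    (re.compile(r"\b0x[0-9a-fA-F]+\b"), "<HEX>"),
    (re.compile(r"\b\d+\b"), "<NUM>"),
]


def to_template(message: str) -> str:
    """Reduce a raw error message to a template by masking variable tokens."""
    template = message.strip()
    for pattern, placeholder in _MASKS:
        template = pattern.sub(placeholder, template)
    # Collapse runs of whitespace so spacing differences do not split clusters.
    return re.sub(r"\s+", " ", template)


def cluster_messages(messages):
    """Group messages by template; returns {template: [original messages]}."""
    clusters = defaultdict(list)
    for msg in messages:
        clusters[to_template(msg)].append(msg)
    return dict(clusters)


if __name__ == "__main__":
    sample = [
        "Transfer failed for /data/run123/file.root after 3 attempts",
        "Transfer failed for /data/run456/other.root after 12 attempts",
        "Authorization failure for user 1001",
    ]
    for template, members in cluster_messages(sample).items():
        print(len(members), template)
```

Exact template matching only catches messages that differ in the masked tokens. Grouping messages that are worded differently but mean the same thing, which the abstract also mentions, would need a similarity-based method on top of this.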

GigaSOM.jl: High-performance clustering and visualization of huge cytometry datasets

GigaScience ◽

10.1093/gigascience/giaa127 ◽

2020 ◽

Vol 9 (11) ◽

Cited By ~ 1

Author(s):

Miroslav Kratochvíl ◽

Oliver Hunewald ◽

Laurent Heirendt ◽

Vasco Verissimo ◽

Jiří Vondrášek ◽

...

Keyword(s):

Single Cell ◽

High Performance ◽

Current State ◽

Technological Advances ◽

Data Points ◽

Distributed Computing Infrastructures ◽

High Level ◽

Complex Phenomena ◽

Computational Resources ◽

Performance Programming

Background: The amount of data generated in large clinical and phenotyping studies that use single-cell cytometry is constantly growing. Recent technological advances allow the easy generation of data with hundreds of millions of single-cell data points with >40 parameters, originating from thousands of individual samples. Analyzing that amount of high-dimensional data places heavy demands on both the hardware and the software of high-performance computational resources. Current software tools often do not scale to datasets of such size; users are thus forced to downsample the data to manageable sizes, in turn losing accuracy and the ability to detect many underlying complex phenomena. Results: We present GigaSOM.jl, a fast and scalable implementation of clustering and dimensionality reduction for flow and mass cytometry data. The implementation of GigaSOM.jl in the high-level and high-performance programming language Julia makes it accessible to the scientific community and allows for efficient handling and processing of datasets with billions of data points using distributed computing infrastructures. We describe the design of GigaSOM.jl, measure its performance and horizontal scaling capability, and showcase the functionality on a large dataset from a recent study. Conclusions: GigaSOM.jl facilitates the use of commonly available high-performance computing resources to process the largest available datasets within minutes, while producing results of the same quality as the current state-of-the-art software. Measurements indicate that the performance scales to much larger datasets. The example use on the data from a massive mouse phenotyping effort confirms the applicability of GigaSOM.jl to huge-scale studies.

MULTIDISCIPLINARY NEUROINFORMATICS PROBLEMS FOR EXECUTION IN DISTRIBUTED COMPUTING INFRASTRUCTURES

Systems and Means of Informatics ◽

10.14357/08696527200205 ◽

2020 ◽

Keyword(s):

Distributed Computing ◽

Distributed Computing Infrastructures

Evaluating Distributed Computing Infrastructures: An Empirical Study Comparing Hadoop Deployments on Cloud and Local Systems

IEEE Transactions on Cloud Computing ◽

10.1109/tcc.2019.2902377 ◽

2019 ◽

pp. 1-1 ◽

Cited By ~ 1

Author(s):

Devipsita Bhattacharya ◽

Faiz Currim ◽

Sudha Ram

Keyword(s):

Distributed Computing ◽

Empirical Study ◽

Local Systems ◽

Distributed Computing Infrastructures

Security Aspects in Resource Management Systems in Distributed Computing Environments

Foundations of Computing and Decision Sciences ◽

10.1515/fcds-2017-0015 ◽

2017 ◽

Vol 42 (4) ◽

pp. 299-313 ◽

Cited By ~ 1

Author(s):

Marcin Adamski ◽

Krzysztof Kurowski ◽

Marek Mika ◽

Wojciech Piątek ◽

Jan Węglarz

Keyword(s):

Resource Management ◽

Distributed Computing ◽

Experimental Studies ◽

Cyber Attacks ◽

Sensitive Data ◽

Distributed Computing Systems ◽

Computing Environments ◽

Distributed Computing Infrastructures ◽

Management Approaches ◽

Bug Fixes

In many distributed computing systems, aspects related to security are becoming more and more relevant. Security is ubiquitous and cannot be treated as a separate problem or challenge. In our opinion it should be considered in the context of resource management in distributed computing environments like Grids and Clouds; for example, scheduled computations can be significantly delayed because of cyber-attacks or inefficient infrastructure, and users' valuable and sensitive data can be stolen even while a computation runs correctly. To prevent such cases there is a need to introduce new evaluation metrics for resource management that represent the level of security of computing resources and, more broadly, of distributed computing infrastructures. In our approach, we have introduced a new metric called reputation, which determines the level of reliability of computing resources from the security perspective and could be taken into account during scheduling procedures. The new reputation metric is based on various relevant parameters regarding cyber-attacks (including energy attacks) and administrative activities such as security updates, bug fixes and security patches. Moreover, we have conducted various computational experiments within the Grid Scheduling Simulator environment (GSSIM) inspired by real application scenarios. Finally, our experimental studies of new resource management approaches taking into account critical security aspects are also discussed in this paper.
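
To make the idea of a security-aware scheduling metric concrete, here is a minimal sketch of how a reputation score might be folded into resource selection. The abstract does not give the paper's formula, so everything below is invented for illustration: the `Resource` fields, the scoring function, and the `min_reputation` threshold are all assumptions, not the authors' definition.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    attacks_last_month: int  # hypothetical input: recorded cyber-attacks
    days_since_patch: int    # hypothetical input: age of the last security update
    free_cores: int


def reputation(r: Resource) -> float:
    """Illustrative score in (0, 1]; higher means more trustworthy.

    NOT the paper's formula: both factors are made up for this sketch.
    """
    attack_factor = 1.0 / (1.0 + r.attacks_last_month)
    patch_factor = 1.0 / (1.0 + r.days_since_patch / 30.0)
    return attack_factor * patch_factor


def pick_resource(resources, cores_needed, min_reputation=0.25):
    """Choose the most reputable resource that has enough free cores.

    Returns None when no resource meets both the capacity and the
    reputation requirement.
    """
    candidates = [
        r for r in resources
        if r.free_cores >= cores_needed and reputation(r) >= min_reputation
    ]
    return max(candidates, key=reputation, default=None)


if __name__ == "__main__":
    pool = [
        Resource("site-a", attacks_last_month=0, days_since_patch=0, free_cores=8),
        Resource("site-b", attacks_last_month=3, days_since_patch=60, free_cores=64),
    ]
    chosen = pick_resource(pool, cores_needed=4)
    print(chosen.name if chosen else "no suitable resource")
```

With a threshold like this, a job may wait rather than run on a poorly maintained resource. That trades delay against security exposure, which is the kind of trade-off the abstract says a scheduler should weigh.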

Coherent Application Delivery on Hybrid Distributed Computing Infrastructures of Virtual Machines and Docker Containers

2017 25th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP) ◽

10.1109/pdp.2017.29 ◽

2017 ◽

Cited By ~ 4

Author(s):

German Molto ◽

Miguel Caballer ◽

Alfonso Perez ◽

Carlos De Alfonso ◽

Ignacio Blanquer

Keyword(s):

Distributed Computing ◽

Virtual Machines ◽

Distributed Computing Infrastructures

dispel4py: A Python framework for data-intensive scientific computing

The International Journal of High Performance Computing Applications ◽

10.1177/1094342016649766 ◽

2016 ◽

Vol 31 (4) ◽

pp. 316-334 ◽

Cited By ~ 6

Author(s):

Rosa Filgueira ◽

Amrey Krause ◽

Malcolm Atkinson ◽

Iraklis Klampanos ◽

Alexander Moreno

Keyword(s):

Message Passing ◽

High Performance ◽

Message Passing Interface ◽

Local Development ◽

Distributed Data ◽

Data Streaming ◽

Data Intensive ◽

Distributed Computing Infrastructures ◽

Python Programming ◽

Smooth Transitions

This paper presents dispel4py, a new Python framework for describing abstract stream-based workflows for distributed data-intensive applications. These combine the familiarity of Python programming with the scalability of workflows. Data streaming is used to gain performance, rapid prototyping and applicability to live observations. dispel4py enables scientists to focus on their scientific goals, avoiding distracting details and retaining flexibility over the computing infrastructure they use. The implementation, therefore, has to map dispel4py abstract workflows optimally onto target platforms chosen dynamically. We present four dispel4py mappings: Apache Storm, message-passing interface (MPI), multi-threading and sequential, showing two major benefits: a) smooth transitions from local development on a laptop to scalable execution for production work, and b) scalable enactment on significantly different distributed computing infrastructures. Three application domains are reported, and measurements on multiple infrastructures show the optimisations achieved; these domains have provided demanding real applications and helped us develop effective training. dispel4py (dispel4py.org) is an open-source project to which we invite participation. The effective mapping of dispel4py onto multiple target infrastructures demonstrates exploitation of data-intensive and high-performance computing (HPC) architectures and consistent scalability.
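
The separation between an abstract workflow and the mapping that executes it can be sketched in plain Python. The example below does not use the dispel4py API. It only illustrates the general idea with a list of stage functions run by two hypothetical "mappings", one sequential and one multi-threaded, and the names `run_sequential` and `run_threaded` are assumptions made for this sketch.

```python
from concurrent.futures import ThreadPoolExecutor


def run_sequential(stages, source):
    """Sequential mapping: push each item through every stage in order."""
    results = []
    for item in source:
        for stage in stages:
            item = stage(item)
        results.append(item)
    return results


def run_threaded(stages, source, workers=4):
    """Multi-threaded mapping: process items concurrently in a thread pool.

    The workflow definition (the list of stages) is unchanged; only the
    way it is executed differs.
    """
    def pipeline(item):
        for stage in stages:
            item = stage(item)
        return item

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in input order.
        return list(pool.map(pipeline, source))


if __name__ == "__main__":
    # A toy two-stage workflow: add one, then double.
    workflow = [lambda x: x + 1, lambda x: x * 2]
    print(run_sequential(workflow, range(5)))
    print(run_threaded(workflow, range(5)))
```

Because both runners accept the same workflow description, a workflow developed and debugged with the sequential runner can be handed to the concurrent one unchanged. This is the kind of transition from local development to scalable execution that the abstract describes.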

Multi-criteria and satisfaction oriented scheduling for hybrid distributed computing infrastructures

Future Generation Computer Systems ◽

10.1016/j.future.2015.03.022 ◽

2016 ◽

Vol 55 ◽

pp. 428-443 ◽

Cited By ~ 12

Author(s):

Mircea Moca ◽

Cristian Litan ◽

Gheorghe Cosmin Silaghi ◽

Gilles Fedak

Keyword(s):

Distributed Computing ◽

Distributed Computing Infrastructures

Synchronizing Execution of Big Data in Distributed and Parallelized Environments

Big Data ◽

10.4018/978-1-4666-9840-6.ch071 ◽

2016 ◽

pp. 1555-1581

Author(s):

Gueyoung Jung ◽

Tridib Mukherjee

Keyword(s):

Big Data ◽

Distributed System ◽

Data Analytics ◽

High Performance ◽

Large Scale ◽

Big Data Analytics ◽

Loosely Coupled ◽

Current Trends ◽

Distributed Computing Infrastructures ◽

Performance Computing

In the modern information era, the amount of data has exploded. Current trends further indicate exponential growth of data in the future. This humongous amount of data, referred to as big data, has given rise to the problem of finding the “needle in the haystack” (i.e., extracting meaningful information from big data). Many researchers and practitioners are focusing on big data analytics to address the problem. One of the major issues in this regard is the computation requirement of big data analytics. In recent years, the proliferation of many loosely coupled distributed computing infrastructures (e.g., modern public, private, and hybrid clouds, high performance computing clusters, and grids) has enabled high computing capability to be offered for large-scale computation. This has allowed the execution of big data analytics to gather pace across organizations and enterprises. However, even with this high computing capability, it is a big challenge to efficiently extract valuable information from vast, astronomical amounts of data. Hence, we require unforeseen scalability of performance to deal with the execution of big data analytics. A big question in this regard is how to maximally leverage the high computing capabilities of the aforementioned loosely coupled distributed infrastructures to ensure fast and accurate execution of big data analytics. This chapter therefore focuses on synchronous parallelization of big data analytics over a distributed system environment to optimize performance.

Digi-Clima Grid: image processing and distributed computing for recovering historical climate data

CLEI electronic journal ◽

10.19153/cleiej.18.3.4 ◽

2015 ◽

Author(s):

Sergio Nesmachnow ◽

Gabriel Usera ◽

Francisco Brasileiro

Keyword(s):

Image Processing ◽

Parallel Computing ◽

Distributed Computing ◽

Experimental Analysis ◽

Climate Data ◽

Processing Load ◽

Efficient Tool ◽

Historical Climate ◽

Distributed Computing Infrastructures ◽

Cloud Infrastructures

This article describes the Digi-Clima Grid project, whose main goals are to design and implement semi-automatic techniques for digitizing and recovering historical climate records by applying parallel computing techniques over distributed computing infrastructures. The specific tool developed for image processing is described, and the implementation over grid and cloud infrastructures is reported. An experimental analysis over institutional and volunteer-based grid/cloud distributed systems demonstrates that the proposed approach is an efficient tool for recovering historical climate data. The parallel implementations make it possible to distribute the processing load, achieving accurate speedup values.
