Session details: Data Movement II

Numerical algorithms for high-performance computational science

Philosophical Transactions of The Royal Society A Mathematical Physical and Engineering Sciences ◽

10.1098/rsta.2019.0066 ◽

2020 ◽

Vol 378 (2166) ◽

pp. 20190066 ◽

Cited By ~ 2

Author(s):

Jack Dongarra ◽

Laura Grigori ◽

Nicholas J. Higham

Keyword(s):

High Performance ◽

Numerical Algorithms ◽

Computational Science ◽

Floating Point ◽

Important Criterion ◽

Data Movement ◽

Floating Point Arithmetic ◽

High Performance Computers ◽

Point Arithmetic ◽

Speed And Accuracy

A number of features of today’s high-performance computers make it challenging to exploit these machines fully for computational science. These include increasing core counts but stagnant clock frequencies; the high cost of data movement; use of accelerators (GPUs, FPGAs, coprocessors), making architectures increasingly heterogeneous; and multiple precisions of floating-point arithmetic, including half-precision. Moreover, as well as maximizing speed and accuracy, minimizing energy consumption is an important criterion. New generations of algorithms are needed to tackle these challenges. We discuss some approaches that we can take to develop numerical algorithms for high-performance computational science, with a view to exploiting the next generation of supercomputers. This article is part of a discussion meeting issue ‘Numerical algorithms for high-performance computational science’.

Improving communication by optimizing on-node data movement with data layout

Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming ◽

10.1145/3437801.3441598 ◽

2021 ◽

Author(s):

Tuowen Zhao ◽

Mary Hall ◽

Hans Johansen ◽

Samuel Williams

Keyword(s):

Data Layout ◽

Data Movement

Categorification of the Müller-Wichards System Performance Estimation Model: Model Symmetries, Invariants, and Closed Forms

Systems ◽

10.3390/systems7010006 ◽

2019 ◽

Vol 7 (1) ◽

pp. 6

Author(s):

Allen D. Parks ◽

David J. Marchette

Keyword(s):

Closed Form ◽

System Performance ◽

Algebraic Method ◽

Expressive Power ◽

Computer Applications ◽

Performance Estimation ◽

Parallel Computer ◽

Estimation Model ◽

Data Movement ◽

Fundamental Symmetry

The Müller-Wichards model (MW) is an algebraic method that quantitatively estimates the performance of sequential and/or parallel computer applications. Because of category theory’s expressive power and mathematical precision, a category theoretic reformulation of MW, i.e., CMW, is presented in this paper. The CMW is effectively numerically equivalent to MW and can be used to estimate the performance of any system that can be represented as numerical sequences of arithmetic, data movement, and delay processes. The CMW fundamental symmetry group is introduced and CMW’s category theoretic formalism is used to facilitate the identification of associated model invariants. The formalism also yields a natural approach to dividing systems into subsystems in a manner that preserves performance. Closed form models are developed and studied statistically, and special case closed form models are used to abstractly quantify the effect of parallelization upon processing time vs. loading, as well as to establish a system performance stationary action principle.

Classifying Data to Reduce Long-Term Data Movement in Shingled Write Disks

ACM Transactions on Storage ◽

10.1145/2851505 ◽

2016 ◽

Vol 12 (1) ◽

pp. 1-17 ◽

Cited By ~ 7

Author(s):

Stephanie N. Jones ◽

Ahmed Amer ◽

Ethan L. Miller ◽

Darrell D. E. Long ◽

Rekha Pitchumani ◽

...

Keyword(s):

Data Movement ◽

Term Data

Reducing Data Movement on Large Shared Memory Systems by Exploiting Computation Dependencies

Proceedings of the 2018 International Conference on Supercomputing - ICS '18 ◽

10.1145/3205289.3205310 ◽

2018 ◽

Cited By ~ 5

Author(s):

Isaac Sánchez Barrera ◽

Miquel Moretó ◽

Eduard Ayguadé ◽

Jesús Labarta ◽

Mateo Valero ◽

...

Keyword(s):

Shared Memory ◽

Memory Systems ◽

Data Movement

Hardware-Accelerated Dual-Split Trees

Proceedings of the ACM on Computer Graphics and Interactive Techniques ◽

10.1145/3406185 ◽

2020 ◽

Vol 3 (2) ◽

pp. 1-21

Author(s):

Daqi Lin ◽

Elena Vasiou ◽

Cem Yuksel ◽

Daniel Kopta ◽

Erik Brunvand

Keyword(s):

Ray Tracing ◽

Hardware Acceleration ◽

Memory Storage ◽

Compact Representation ◽

Space Partitioning ◽

Data Movement ◽

Bounding Volume ◽

Bounding Boxes ◽

Split Trees ◽

Bounding Volume Hierarchies

Bounding volume hierarchies (BVHs) are the most widely used acceleration structures for ray tracing due to their high construction and traversal performance. However, the bounding planes shared between parent and child bounding boxes are an inherent storage redundancy that limits further improvement in performance due to the memory cost of reading these redundant planes. Dual-split trees can create the same space partitioning as BVHs, but in a compact form using less memory by eliminating the redundancies of the BVH structure representation. This reduction in memory storage and data movement translates to faster ray traversal and better energy efficiency. Yet, the performance benefits of dual-split trees are undermined by the processing required to extract the necessary information from their compact representation. This involves bit manipulations and branching instructions, which are inefficient in software. We introduce hardware acceleration for dual-split trees and show that the performance advantages over BVHs are emphasized in a hardware ray tracing context that can take advantage of such acceleration. We provide details on how the operations needed for decoding dual-split tree nodes can be implemented in hardware and present experiments in a number of scenes with different sizes using path tracing. In our experiments, we have observed up to 31% reduction in render time and 38% energy saving using dual-split trees as compared to binary BVHs representing identical space partitioning.

Understanding transparency of government from a Nordic perspective: open government and open data movement as a multidimensional collaborative phenomenon in Sweden

Journal of Global Information Technology Management ◽

10.1080/1097198x.2017.1388696 ◽

2017 ◽

Vol 20 (4) ◽

pp. 236-275 ◽

Cited By ~ 9

Author(s):

Maxat Kassen

Keyword(s):

Open Data ◽

Open Government ◽

Data Movement ◽

Open Data Movement

Automatic data movement and computation mapping for multi-level parallel architectures with explicitly managed memories

Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming - PPoPP '08 ◽

10.1145/1345206.1345210 ◽

2008 ◽

Cited By ~ 40

Author(s):

Muthu Manikandan Baskaran ◽

Uday Bondhugula ◽

Sriram Krishnamoorthy ◽

J. Ramanujam ◽

Atanas Rountev ◽

...

Keyword(s):

Parallel Architectures ◽

Automatic Data ◽

Data Movement ◽

Multi Level

Development of a flow-based planning support system based on open data for the City of Atlanta

Environment and Planning B Urban Analytics and City Science ◽

10.1177/2399808317705881 ◽

2017 ◽

Vol 46 (2) ◽

pp. 207-224

Author(s):

Ge Zhang ◽

Wenwen Zhang ◽

Subhrajit Guhathakurta ◽

Nisha Botchwey

Keyword(s):

Support System ◽

Open Data ◽

Community Planning ◽

Relevant Information ◽

Easy Access ◽

Data Movement ◽

Planning Support ◽

Planning Support System ◽

Analytical Tools ◽

The City

Open data have come of age with many cities, states, and other jurisdictions joining the open data movement by offering relevant information about their communities for free and easy access to the public. Despite the growing volume of open data, their use has been limited in planning scholarship and practice. The bottleneck is often the format in which the data are available and the organization of such data, which may be difficult to incorporate in existing analytical tools. The overall goal of this research is to develop an open data-based community planning support system that can collect related open data, analyze the data for specific objectives, and visualize the results to improve usability. To accomplish this goal, this study undertakes three research tasks. First, it describes the current state of open data analysis efforts in the community planning field. Second, it examines the challenges analysts experience when using open data in planning analysis. Third, it develops a new flow-based planning support system for examining neighborhood quality of life and health for the City of Atlanta as a prototype, which addresses many of these open data challenges.

Assessment of Linux' data path implementations for download and streaming

International Journal of Software Engineering and Knowledge Engineering ◽

10.1142/s0218194007003343 ◽

2007 ◽

Vol 17 (04) ◽

pp. 465-481 ◽

Cited By ~ 1

Author(s):

Pål Halvorsen ◽

Tom Anders Dalseng ◽

Carsten Griwodz

Keyword(s):

Operating Systems ◽

Comprehensive Evaluation ◽

Streaming Data ◽

High Rate ◽

Data Path ◽

Power Budget ◽

Data Movement ◽

Technological Advances ◽

Reduced Power Consumption ◽

Streaming Systems

Distributed multimedia streaming systems are increasingly popular due to technological advances, and numerous streaming services are available today. On servers or proxy caches, there is a huge scaling challenge in supporting thousands of concurrent users that request delivery of high-rate, time-dependent data like audio and video, because this requires transfers of large amounts of data through several sub-systems within a streaming node. Unnecessary copy operations in the data path can therefore contribute significantly to the resource consumption of streaming operations. Despite previous research, off-the-shelf operating systems have only limited support for data paths that have been optimized for streaming. Additionally, system call overhead has grown with newer operating system editions, adding to the cost of data movement. Frequently, it is argued that these issues can be ignored because of the continuing growth of CPU speeds. However, such an argument fails to take problems of modern streaming systems into account. The dissipation of heat generated by disks and high-end CPUs is a major problem of data centers, which would be alleviated if less power-hungry CPUs could be used. The power budget of mobile devices, which are increasingly used for streaming as well, is tight, and reduced power consumption is an important issue. In this paper, we prove that these operations consume a large amount of resources, and we therefore revisit the data movement problem and provide a comprehensive evaluation of possible streaming data I/O paths in the Linux 2.6 kernel. We have implemented and evaluated several enhanced mechanisms and show how to provide support for more efficient memory usage and reduction of user/kernel space switches for content download and streaming applications. In particular, we are able to reduce the CPU usage by approximately 27% compared to the best approach without kernel modifications, by removing copy operations and system calls for a streaming scenario in which RTP headers must be added to stored data for sequence numbers and timing.
