Optimized Data Transfers Based on the OpenCL Event Management Mechanism

2015 · Vol 2015 · pp. 1-16
Author(s): Hiroyuki Takizawa, Shoichi Hirasawa, Makoto Sugawara, Isaac Gelado, Hiroaki Kobayashi, et al.

In standard OpenCL programming, hosts are supposed to control their compute devices. Since compute devices are dedicated to kernel computation, only hosts can execute several kinds of data transfers, such as internode communication and file access. These data transfers require one host to simultaneously play two or more roles because the host and its devices must collaborate. The code for such data transfers tends to be system-specific, resulting in low portability. This paper proposes an OpenCL extension that incorporates such data transfers into the OpenCL event management mechanism. Unlike the current OpenCL standard, the main thread running on the host is not blocked to serialize dependent operations; hence, an application can easily exploit opportunities to overlap the parallel activities of hosts and compute devices. In addition, the implementation details of data transfers are hidden behind the extension, so application programmers can use optimized data transfers without any tricky programming techniques. The evaluation results show that the proposed extension can use an optimized data transfer implementation and thereby increase sustained data transfer performance by about 18% for a real application accessing a big data file.
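The non-blocking dependency chaining the abstract describes can be illustrated in miniature. This is a conceptual Python sketch, not the OpenCL C API; `file_read` and `kernel` are hypothetical stand-ins. A host-side transfer and a dependent kernel are linked through an event-like future, so the main thread never blocks to serialize them.

```python
# Conceptual sketch: dependent operations are chained through event-like
# futures instead of blocking the host's main thread, mirroring how the
# proposed extension folds host-side data transfers into the OpenCL event
# dependency graph. All names here are hypothetical stand-ins.
from concurrent.futures import ThreadPoolExecutor

pool = ThreadPoolExecutor(max_workers=2)

def file_read(path):           # host-side data transfer (e.g., file access)
    return list(range(4))      # stand-in for real I/O

def kernel(data):              # device-side computation
    return [x * x for x in data]

# Enqueue the transfer; the returned future plays the role of an OpenCL event.
read_evt = pool.submit(file_read, "input.dat")

# The "kernel" waits on the event inside a worker, not on the main thread,
# so the host stays free to overlap other activities.
kern_evt = pool.submit(lambda: kernel(read_evt.result()))

print(kern_evt.result())       # [0, 1, 4, 9]
```

In the real extension, the dependency would be expressed by passing a `cl_event` in an event wait list; the future here only models that ordering relationship.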

2014 · pp. 316-323
Author(s): Tevaganthan Veluppillai, Brandon Ortiz, Robert E. Hiromoto

Several well-known data transfer protocols are presented in a comparative study that addresses big data transfer for tablet-class machines. The protocols include standard Java and C++ stream-based transfers, block-data transfer protocols that use the Java New IO (NIO) and Zerocopy libraries, and a block-data C++ transfer protocol. Several experiments are described, and the results are compared against the standard Java IO and C++ stream-based file transport protocols. The motivation for this study is the development of a client/server big data file transport protocol for tablet-class client machines that relies on the Java Remote Method Invocation (RMI) package for distributed computing.
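For context, the zero-copy block transfer the study benchmarks (Java NIO's `transferTo()` and the Zerocopy library) has a POSIX analogue in `sendfile`, which moves file data inside the kernel without staging it through user-space buffers. A minimal sketch, assuming a Linux host:

```python
# Hedged sketch: os.sendfile is the POSIX analogue of the NIO/Zerocopy
# transferTo() path compared in the study; the kernel moves the bytes
# directly, avoiding a user-space copy (Linux supports regular-file
# destinations for sendfile).
import os
import tempfile

def zerocopy_transfer(src_path, dst_path):
    """Copy src to dst via sendfile; returns the number of bytes moved."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        size = os.fstat(src.fileno()).st_size
        sent = 0
        while sent < size:
            sent += os.sendfile(dst.fileno(), src.fileno(), sent, size - sent)
        return sent

# Demo on a throwaway file.
with tempfile.TemporaryDirectory() as d:
    src = os.path.join(d, "src.bin")
    dst = os.path.join(d, "dst.bin")
    with open(src, "wb") as f:
        f.write(b"\x00" * 65536)
    print(zerocopy_transfer(src, dst))   # 65536
```

A stream-based transfer would instead read into a user buffer and write back out, which is the extra copy the block-data protocols in the study are designed to avoid.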


2018 · Vol 2018 · pp. 1-8
Author(s): Taeuk Kim, Awais Khan, Youngjae Kim, Preethika Kasu, Scott Atchley

The ever-growing trend of big data has led scientists to share and transfer simulation and analytical data across geodistributed research and computing facilities. However, the existing data transfer frameworks used for data sharing lack the capability to exploit the attributes of the underlying parallel file systems (PFS). LADS (Layout-Aware Data Scheduling) is an end-to-end data transfer tool optimized for terabit networks that uses layout-aware data scheduling via the PFS; however, it does not consider the NUMA (Non-Uniform Memory Access) architecture. In this paper, we propose NUMA-aware thread and resource scheduling for optimized data transfer in terabit networks. First, we propose distributed RMA buffers to reduce memory controller contention in CPU sockets, and we then schedule threads based on the CPU socket and the NUMA nodes inside each socket to reduce memory access latency. We design and implement the proposed resource and thread scheduling in the existing LADS framework. Experimental results show from 21.7% to 44% improvement with memory-level optimizations in the LADS framework compared to the baseline without any optimization.
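The buffer-to-node scheduling idea can be sketched abstractly. This is a hypothetical model, not the LADS implementation: each RMA buffer is served by threads on its home NUMA node, so memory accesses stay node-local and cross-socket memory controller traffic is avoided.

```python
# Hedged sketch (hypothetical model): map each distributed RMA buffer to a
# per-node thread pool on the buffer's home NUMA node, so transfer threads
# only touch node-local memory, as in the paper's NUMA-aware scheduling.
def numa_schedule(buffers, nodes):
    """Plan which NUMA node's threads serve each RMA buffer.

    buffers: list of (buffer_id, home_node) pairs
    nodes:   number of NUMA nodes in the machine
    Returns {node: [buffer_id, ...]} -- each node's thread pool handles only
    the buffers resident in that node's memory.
    """
    plan = {n: [] for n in range(nodes)}
    for buf_id, home in buffers:
        plan[home % nodes].append(buf_id)
    return plan

# Four buffers spread over two NUMA nodes.
print(numa_schedule([(0, 0), (1, 1), (2, 0), (3, 1)], 2))
# {0: [0, 2], 1: [1, 3]}
```

In a real implementation, the per-node pools would additionally pin their threads to the CPUs of that socket (e.g., with `numactl` or libnuma), which this sketch omits.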


2020 · Vol 22 (2) · pp. 130-144
Author(s): Aiqin Hou, Chase Qishi Wu, Liudong Zuo, Xiaoyang Zhang, Tao Wang, et al.

2018 · Vol 8 (11) · pp. 2216
Author(s): Jiahui Jin, Qi An, Wei Zhou, Jiakai Tang, Runqun Xiong

Network bandwidth is a scarce resource in big data environments, so data locality is a fundamental problem for data-parallel frameworks such as Hadoop and Spark. This problem is exacerbated in multicore server-based clusters, where multiple tasks running on the same server compete for the server's network bandwidth. Existing approaches solve this problem by scheduling computational tasks near the input data, considering the server's free time, data placements, and data transfer costs. However, such approaches usually set identical values for data transfer costs, even though a multicore server's data transfer cost increases with the number of data-remote tasks; as a result, they minimize data-processing time ineffectively. As a solution, we propose DynDL (Dynamic Data Locality), a novel data-locality-aware task-scheduling model that handles dynamic data transfer costs for multicore servers. DynDL offers greater flexibility than existing approaches by using a set of non-decreasing functions to evaluate dynamic data transfer costs. We also propose online and offline algorithms (based on DynDL) that minimize data-processing time and adaptively adjust data locality. Although DynDL is NP-complete (nondeterministic polynomial-complete), we prove that the offline algorithm runs in quadratic time and generates optimal results for DynDL's specific uses. Using a series of simulations and real-world executions, we show that our algorithms reduce data-processing time by 30% compared with algorithms that do not consider dynamic data transfer costs. Moreover, they can adaptively adjust data localities based on the server's free time, data placement, and network bandwidth, and can schedule tens of thousands of tasks within seconds or less.
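The dynamic-cost idea can be sketched with a toy greedy scheduler. This is a hypothetical illustration, not the authors' DynDL algorithms: a server's transfer cost for its k-th data-remote task is given by a non-decreasing function `cost_fn(k)`, and each task takes the cheapest feasible placement, preferring a data-local slot when one is free.

```python
# Hedged sketch of the dynamic-cost idea (hypothetical greedy model, not
# DynDL itself): remote transfer cost on a server grows with the number of
# data-remote tasks it already runs, per a non-decreasing cost function.
def schedule(tasks, capacity, cost_fn):
    """tasks:    list of (task_id, data_local_server) pairs
    capacity: {server: free_slots}
    cost_fn:  cost of a server's k-th remote task, non-decreasing in k
    Returns (placement, total_transfer_cost)."""
    remote = {s: 0 for s in capacity}   # remote-task count per server
    placement, total_cost = {}, 0
    for task, local in tasks:
        if capacity[local] > 0:          # data-local: no transfer cost
            placement[task] = local
            capacity[local] -= 1
            continue
        # Otherwise pick the server whose marginal dynamic cost is smallest.
        candidates = [s for s in capacity if capacity[s] > 0]
        best = min(candidates, key=lambda s: cost_fn(remote[s] + 1))
        placement[task] = best
        capacity[best] -= 1
        remote[best] += 1
        total_cost += cost_fn(remote[best])
    return placement, total_cost

# Three tasks all local to s0, which has only one free slot; linear cost.
plan, cost = schedule([("t1", "s0"), ("t2", "s0"), ("t3", "s0")],
                      {"s0": 1, "s1": 1, "s2": 1},
                      lambda k: k)
print(plan, cost)
```

A fixed-cost scheduler would treat every remote slot as equally expensive; the non-decreasing `cost_fn` is what lets the model capture bandwidth contention as remote tasks pile up on one server.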


2021 · Vol 251 · pp. 01049
Author(s): Yang Xin, Yinning He

Enterprise economic operation analysis mainly involves collecting, integrating, analyzing, and storing information on an enterprise's economic activities, so as to standardize and manage the enterprise's economic activity as a whole. In the era of big data, the analysis of corporate economic operations has changed as well. In short, the big data environment has changed the mechanisms of corporate economic analysis and management, changed how corporate operating data are collected, affected data analysis in multiple ways, and altered the overall structure of corporate economic analysis. This article briefly discusses the challenges facing enterprise economic operation analysis in the era of big data and the corresponding countermeasures, in the hope of providing a reference for enterprise economic management.


Big Data · 2016 · pp. 43-95
Author(s): Se-young Yu, Nevil Brownlee, Aniket Mahanti

1992 · Vol 82 (1) · pp. 497-504
Author(s): R. W. E. Green

Abstract Observations of teleseismic events at remote sites necessitated the development of a portable digital recorder capable of continuously recording the output of a three-component set of long-period transducers. A PC is used as a file management facility, operating in an intermittent or "sleeper" mode. Each of the three components is digitized and stored in a separate, intelligent A-to-D card. When 28 K samples have been generated, a trigger is initiated, and on the transition of the next real-time second the real time is latched and power is applied to the PC. The sample count between the trigger and the latched acknowledgment of the trigger provides an absolute time correlation. After the PC has powered up, the data are downloaded from the three acquisition cards to the PC's hard disk, and the latched real time forms the header label of the data file. Power is then removed from the PC. Sampling at about 15 samples per second, the PC is switched on every 33.45 minutes. Boot-up and data downloading use approximately 5 watts average power; the associated long-period transducers (Guralp CMG3) consume about 3 watts and the remaining electronics 2 watts. All the electronics are housed in a steel cabinet, and the system uses four solar panels charging two 105 Ah batteries. Data transfer to an internal 60 MByte tape streamer necessitates a visit to the station every 24 days.
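The time-correlation scheme described above reduces to simple arithmetic: the latched whole second, minus the sample count between trigger and latch converted to seconds, recovers the absolute trigger time. A small sketch with hypothetical numbers:

```python
# Hedged sketch of the timing scheme: the trigger fires at some sample, the
# real-time clock is latched on the next whole second, and the intervening
# sample count pins the trigger to absolute time. Numbers are illustrative.
def trigger_time(latched_epoch, samples_since_trigger, sample_rate):
    """Absolute trigger time = latched second minus elapsed samples
    converted to seconds at the given sample rate."""
    return latched_epoch - samples_since_trigger / sample_rate

# E.g., clock latched at epoch second 1000.0, with 30 samples counted
# between the trigger and the latch, sampling at 15 samples per second:
print(trigger_time(1000.0, 30, 15.0))   # 998.0
```

Because the count is taken in hardware between two well-defined instants, the correlation is limited only by the stability of the sampling clock, not by the PC's boot-up delay.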

