disk layout
Recently Published Documents


Total documents: 14 (five years: 1)

H-index: 3 (five years: 0)

2021 ◽  
Vol 251 ◽  
pp. 02066
Author(s):  
Javier López-Gómez ◽  
Jakob Blomer

Over the last two decades, ROOT TTree has been used for storing over one exabyte of High-Energy Physics (HEP) events. The TTree columnar on-disk layout has proven to be ideal for analyses of HEP data, which typically require access to many events but only a subset of the information stored for each of them. Future colliders, and particularly HL-LHC, will bring an increase of at least one order of magnitude in the volume of generated data. Therefore, the use of modern storage hardware, such as low-latency high-bandwidth NVMe devices and distributed object stores, becomes more important. However, TTree was not designed to optimally exploit modern hardware and may become a bottleneck for data retrieval. The ROOT RNTuple I/O system aims at overcoming TTree’s limitations and at providing improved efficiency for modern storage systems. In this paper, we extend RNTuple with a backend that uses Intel DAOS as the underlying storage, demonstrating that the RNTuple architecture can accommodate high-performance object stores. From the user perspective, data can be accessed with minimal changes to the code, namely by replacing a filesystem path with a DAOS URI. Our performance evaluation shows that the new backend can be used for realistic analyses, while outperforming the compatibility solution provided by the DAOS project.
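The benefit of a columnar layout mentioned in this abstract can be illustrated with a toy example. The sketch below is not ROOT, TTree, or RNTuple code; the three-field event record and all function names are hypothetical, chosen only to show why reading one field out of many touches fewer bytes when each field is stored contiguously.

```python
import struct

# Hypothetical toy events with three float fields each (names are illustrative).
events = [(float(i), float(i) * 0.5, float(i) * 0.25) for i in range(1000)]


def row_layout(evts):
    """Store all fields of one event together, event after event."""
    return b"".join(struct.pack("<3d", *e) for e in evts)


def columnar_layout(evts):
    """Store each field contiguously across all events (one buffer per field)."""
    n = len(evts)
    return [struct.pack(f"<{n}d", *(e[i] for e in evts)) for i in range(3)]


def read_first_field_from_rows(buf):
    """Return the first field of every event and the number of bytes scanned."""
    values = [rec[0] for rec in struct.iter_unpack("<3d", buf)]
    return values, len(buf)


def read_first_field_from_columns(cols):
    """Return the first field of every event, touching only its own column."""
    n = len(cols[0]) // 8  # 8 bytes per little-endian double
    return list(struct.unpack(f"<{n}d", cols[0])), len(cols[0])


row_values, row_bytes = read_first_field_from_rows(row_layout(events))
col_values, col_bytes = read_first_field_from_columns(columnar_layout(events))
```

Both reads return the same values, but the columnar read scans one third of the bytes in this toy setup, because the two unused fields are never touched.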


2013 ◽  
Vol 380-384 ◽  
pp. 2195-2199
Author(s):  
Cheng Jiong Wang

This paper discusses the types of file structures in Linux, points out that EXT2 is the most commonly used file system in Linux, analyzes the disk layout, index node (inode) and directory structure of EXT2, and studies a method for accessing files in EXT2 by name that makes access faster and more efficient.
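As a rough illustration of the structures this abstract names, the sketch below walks a single EXT2-style directory block to resolve a name to an inode number, then computes which block group holds that inode. It follows the general EXT2 on-disk conventions (directory entries of inode, record length, name length, file type, name; inodes numbered from 1), not the specific method studied in the paper. The helper names and the in-memory block are hypothetical.

```python
import struct

DIRENT_HEADER = "<IHBB"  # inode (u32), rec_len (u16), name_len (u8), file_type (u8)


def pack_dirent(inode, name, file_type=1):
    """Build one directory entry, padded so the record is 4-byte aligned."""
    raw = name.encode()
    size = 8 + len(raw)
    rec_len = (size + 3) & ~3
    header = struct.pack(DIRENT_HEADER, inode, rec_len, len(raw), file_type)
    return header + raw + b"\0" * (rec_len - size)


def lookup(block, name):
    """Linearly scan a directory block and return the inode number for name."""
    target = name.encode()
    off = 0
    while off + 8 <= len(block):
        inode, rec_len, name_len, _ftype = struct.unpack_from(DIRENT_HEADER, block, off)
        if rec_len < 8:
            break  # corrupt or empty record; stop instead of looping forever
        if inode != 0 and block[off + 8:off + 8 + name_len] == target:
            return inode
        off += rec_len
    return None


def locate_inode(ino, inodes_per_group, inode_size=128):
    """Return (block group, byte offset inside that group's inode table)."""
    index = ino - 1  # inode numbers start at 1
    return index // inodes_per_group, (index % inodes_per_group) * inode_size


# Hypothetical directory block with three entries.
block = pack_dirent(2, ".", 2) + pack_dirent(2, "..", 2) + pack_dirent(12, "hello.txt")
found = lookup(block, "hello.txt")
location = locate_inode(found, inodes_per_group=8192)
```

The values used here (8192 inodes per group, 128-byte inodes) are assumptions for the example; on a real file system they are read from the superblock.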


2012 ◽  
Vol 16 (3) ◽  
pp. 24-36 ◽  
Author(s):  
Imranul Hoque ◽  
Indranil Gupta

Author(s):  
Feng Chen ◽  
Xiaoning Ding ◽  
Song Jiang

As the major secondary storage device, the hard disk plays a critical role in modern computer systems. In order to improve disk performance, most operating systems apply data prefetching policies by tracking I/O access patterns, mostly at the level of file abstractions. Though such a solution is useful for exploiting application-level access patterns, file-level prefetching has many constraints that limit its ability to fully exploit disk performance. The reasons are twofold. First, certain prefetch opportunities, such as metadata blocks, can only be detected by knowing the data layout on the hard disk. Second, due to the non-uniform access cost on the hard disk, the penalty of mis-prefetching a random block is much higher than that of mis-prefetching a sequential block. In order to address the intrinsic limitations of file-level prefetching, we propose to prefetch data blocks directly at the disk level in a portable way. Our proposed scheme, called DiskSeen, is designed to supplement file-level prefetching. DiskSeen observes the workload access pattern by tracking the locations and access times of disk blocks. Based on analysis of the temporal and spatial relationships of disk data blocks, DiskSeen can significantly increase the sequentiality of disk accesses and improve disk performance in turn. We implemented the DiskSeen scheme in the Linux 2.6 kernel and we show that it can significantly improve the effectiveness of file-level prefetching and reduce execution times by 20-53% for various types of applications, including grep, CVS, and TPC-H.
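The general idea of prefetching from block locations rather than file abstractions can be sketched in a few lines. The code below is a heavily simplified illustration and is not the DiskSeen algorithm: the class name, the parameters `window`, `depth` and `history`, and the trigger rule (prefetch once the preceding blocks were seen recently) are all assumptions made for the example.

```python
from collections import deque


class BlockPrefetcher:
    """Toy disk-level prefetcher driven by a log of (time, block number)."""

    def __init__(self, window=2, depth=4, history=64):
        self.window = window  # preceding blocks that must have been seen
        self.depth = depth    # number of blocks to prefetch ahead
        self.recent = deque(maxlen=history)  # bounded log keeps only recent accesses
        self.clock = 0        # logical access time

    def access(self, block):
        """Record an access and return the block numbers worth prefetching."""
        self.clock += 1
        self.recent.append((self.clock, block))
        seen = {b for _, b in self.recent}
        # Spatial locality check: the blocks just before this one were
        # accessed recently, so the workload looks sequential on disk.
        if all((block - k) in seen for k in range(1, self.window + 1)):
            return [block + k for k in range(1, self.depth + 1)]
        return []


prefetcher = BlockPrefetcher()
trace = [100, 101, 102, 500]
decisions = [prefetcher.access(b) for b in trace]
```

In this trace, only the access to block 102 triggers a prefetch, since blocks 100 and 101 were both seen shortly before it; the isolated access to block 500 triggers nothing, which avoids the costly random mis-prefetch the abstract warns about.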

