Massive High-Performance Global File Systems for Grid computing

Author(s):  
P. Andrews ◽  
P. Kovatch ◽  
C. Jordan
2021 ◽  
Vol 17 (3) ◽  
pp. 1-25
Author(s):  
Bohong Zhu ◽  
Youmin Chen ◽  
Qing Wang ◽  
Youyou Lu ◽  
Jiwu Shu

Non-volatile memory and remote direct memory access (RDMA) provide extremely high performance in storage and network hardware. However, existing distributed file systems strictly isolate the file system and network layers, and their heavy layered software designs leave high-speed hardware under-exploited. In this article, we propose an RDMA-enabled distributed persistent memory file system, Octopus+, which redesigns file system internal mechanisms by closely coupling non-volatile memory and RDMA features. For data operations, Octopus+ directly accesses a shared persistent memory pool to reduce memory copying overhead, and has clients actively fetch and push data to rebalance the load between the server and the network. For metadata operations, Octopus+ introduces self-identified remote procedure calls for immediate notification between the file system and networking layers, and an efficient distributed transaction mechanism for consistency. Octopus+ also supports replication to provide better availability. Evaluations on Intel Optane DC Persistent Memory Modules show that Octopus+ achieves nearly the raw bandwidth for large I/Os and orders of magnitude better performance than existing distributed file systems.


2020 ◽  
Vol 26 (1) ◽  
pp. 89-106
Author(s):  
Kohei Hiraga ◽  
Osamu Tatebe ◽  
Hideyuki Kawashima

Metadata performance scalability is critically important in high-performance computing when many small files are accessed from millions of clients. This paper proposes a design for a scalable distributed metadata server, PPMDS, for parallel file systems using multiple key-value servers. In PPMDS, the hierarchical namespace of a file system is efficiently managed by multiple servers. Multiple entries can be updated atomically using a nonblocking distributed transaction based on a dynamic software transactional memory algorithm. This paper also proposes optimizations that further improve metadata performance by introducing server-side transaction processing, multiple readers, and a shared lock mode, which reduce the number of remote procedure calls and prevent unnecessary blocking. The performance evaluation shows that performance scales up to 3 servers, reaching 62,000 operations per second, a 2.58x improvement over a single metadata server.
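As a rough illustration of how a hierarchical namespace can be spread over several key-value servers, the sketch below hashes each entry's parent directory to pick a server. This is not the PPMDS design: the class name `ShardedNamespace`, the partition-by-parent rule, and the in-memory dictionaries standing in for key-value servers are all assumptions made for illustration, and the nonblocking distributed transaction is not modeled.

```python
import hashlib


class ShardedNamespace:
    """Toy hierarchical namespace spread over in-memory key-value stores."""

    def __init__(self, num_servers):
        # Each dict stands in for one key-value server (hypothetical).
        self.servers = [dict() for _ in range(num_servers)]

    def _server_for(self, parent_dir):
        # Stable hash, so a given parent directory always maps to one server.
        digest = hashlib.sha256(parent_dir.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.servers)

    @staticmethod
    def _split(path):
        # Split "/a/b" into parent "/a" and name "b"; the root parent is "/".
        parent, _, name = path.rstrip("/").rpartition("/")
        return (parent or "/"), name

    def create(self, path, inode):
        parent, name = self._split(path)
        self.servers[self._server_for(parent)][(parent, name)] = inode

    def lookup(self, path):
        parent, name = self._split(path)
        return self.servers[self._server_for(parent)].get((parent, name))

    def listdir(self, directory):
        # All entries of one directory live on the same server by construction.
        store = self.servers[self._server_for(directory)]
        return sorted(name for (parent, name) in store if parent == directory)


ns = ShardedNamespace(num_servers=3)
ns.create("/a", 1)
ns.create("/a/b", 2)
ns.create("/a/c", 3)
print(ns.lookup("/a/b"), ns.listdir("/a"))
```

With this rule a directory listing touches a single server, while an operation such as a rename across directories may touch two, which is the kind of multi-entry update an atomic distributed transaction would have to cover.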


Author(s):  
Maizura Ibrahim ◽  
Hamidah Ibrahim ◽  
Azizol Abdullah ◽  
Rohaya Latip

Author(s):  
Armando Fandango ◽  
William Rivera

Scientific Big Data being gathered at exascale needs to be stored, retrieved and manipulated. The storage stack for scientific Big Data includes a file system at the system level for physical organization of the data, and a file format and input/output (I/O) system at the application level for logical organization of the data; both must be of the high-performance variety at exascale. The high-performance file system is designed with concurrent access, high-speed transmission and fault tolerance characteristics. High-performance file formats and I/O are designed to give parallel and distributed applications easy and fast access to Big Data. These specialized file formats make it easier to store and access Big Data for scientific visualization and predictive analytics. This chapter provides a brief review of the characteristics of high-performance file systems such as Lustre and GPFS, and high-performance file formats such as HDF5, NetCDF, MPI-IO, and HDFS.


Author(s):  
Jagdish Chandra Patni

Powerful computational capabilities and low-cost resource availability are the foremost demands of high-performance computing. Computing resources can be viewed as the edges of an interconnected grid, and the capabilities of grid computing can be attained by balancing the load at various levels. Because resources are heterogeneous and geographically distributed, the grid computing paradigm in its original form cannot meet these requirements, so it can use the capabilities of the cloud and other technologies to achieve the goal. Resource heterogeneity makes grid computing more dynamic and challenging. Therefore, this article discusses the problems of scalability, heterogeneity and adaptability in grid computing from the perspective of providing high computing capability, load balancing and resource availability.
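One simple way to picture load balancing over heterogeneous resources is a greedy rule that gives each task to the resource with the lowest load relative to its capacity. The sketch below is only an illustration of that idea, not the method of the article; the function name `balance`, the task costs, and the capacity values are hypothetical.

```python
import heapq


def balance(tasks, capacities):
    """Greedily assign tasks to the resource with the lowest normalized load.

    tasks: dict mapping task id -> cost (hypothetical units of work)
    capacities: dict mapping resource name -> relative speed (> 0)
    """
    # Heap entries are (load / capacity, resource name).
    heap = [(0.0, name) for name in sorted(capacities)]
    heapq.heapify(heap)
    assigned = {name: [] for name in capacities}
    raw_load = {name: 0.0 for name in capacities}

    # Place the most expensive tasks first, a common greedy heuristic.
    for task_id, cost in sorted(tasks.items(), key=lambda kv: -kv[1]):
        _, name = heapq.heappop(heap)
        assigned[name].append(task_id)
        raw_load[name] += cost
        heapq.heappush(heap, (raw_load[name] / capacities[name], name))
    return assigned


plan = balance({"t1": 4, "t2": 2, "t3": 2}, {"fast": 2.0, "slow": 1.0})
print(plan)
```

Normalizing by capacity is what accounts for heterogeneity here: a resource twice as fast is treated as half as loaded for the same amount of work.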

