Massive-Parallel Trajectory Calculations version 2.2 (MPTRAC-2.2): Lagrangian transport simulations on Graphics Processing Units (GPUs)

Mapping Intimacies ◽

10.31223/x52s7m ◽

2021 ◽

Author(s):

Lars Hoffmann ◽

Paul Baumeister ◽

Zhongyin Cai ◽

Jan Clemens ◽

Sabine Griessbach ◽

...

Keyword(s):

Graphics Processing Units ◽

Large Scale ◽

Transport Model ◽

Programming Model ◽

Meteorological Data ◽

Transport Processes ◽

Model Verification ◽

Lagrangian Transport ◽

Trajectory Calculations ◽

Graphics Processing

Lagrangian models are powerful tools to study atmospheric transport processes. However, conducting large-scale Lagrangian transport simulations with many air parcels can become numerically rather costly. In this study, we assessed the potential of exploiting graphics processing units (GPUs) to accelerate Lagrangian transport simulations. We ported the Massive-Parallel Trajectory Calculations (MPTRAC) model to GPUs using the open accelerator (OpenACC) programming model. The trajectory calculations conducted within the MPTRAC model have been fully ported to GPUs, i.e., except for feeding in the meteorological input data and for extracting the particle output data, the code operates entirely on the GPU devices without frequent data transfers between CPU and GPU memory. Model verification, performance analyses, and scaling tests of the MPI/OpenMP/OpenACC hybrid parallelization of MPTRAC have been conducted on the JUWELS Booster supercomputer operated by the Jülich Supercomputing Centre, Germany. The JUWELS Booster comprises 3744 NVIDIA A100 Tensor Core GPUs, providing a peak performance of 71.0 PFlop/s. As of June 2021, it is the most powerful supercomputer in Europe and listed among the most energy-efficient systems internationally. For large-scale simulations comprising 100 million particles driven by the European Centre for Medium-Range Weather Forecasts’ ERA5 reanalysis, the performance evaluation showed a maximum speedup of a factor of 16 due to the utilization of GPUs compared to CPU-only runs on the JUWELS Booster. In the large-scale GPU run, about 67 % of the runtime is spent on the physics calculations, being conducted on the GPUs. Another 15 % of the runtime is required for file-I/O, mostly to read the ERA5 data from disk. Meteorological data preprocessing on the CPUs also requires about 15 % of the runtime.
Although this study identified potential for further improvements of the GPU code, we consider the MPTRAC model to be ready for production runs on the JUWELS Booster in its present form. The GPU code provides a much faster time to solution than the CPU code, which is particularly relevant for near-real-time applications of a Lagrangian transport model.

Massive-Parallel Trajectory Calculations version 2.2 (MPTRAC-2.2): Lagrangian transport simulations on Graphics Processing Units (GPUs)

10.5194/gmd-2021-382 ◽

2021 ◽

Author(s):

Lars Hoffmann ◽

Paul F. Baumeister ◽

Zhongyin Cai ◽

Jan Clemens ◽

Sabine Griessbach ◽

...

Keyword(s):

Graphics Processing Units ◽

Large Scale ◽

Transport Model ◽

Meteorological Data ◽

Transport Processes ◽

Model Verification ◽

Data Set ◽

Lagrangian Transport ◽

Trajectory Calculations ◽

Graphics Processing

Abstract. Lagrangian models are fundamental tools to study atmospheric transport processes and for practical applications such as dispersion modeling for anthropogenic and natural emission sources. However, conducting large-scale Lagrangian transport simulations with millions of air parcels or more can become numerically rather costly. In this study, we assessed the potential of exploiting graphics processing units (GPUs) to accelerate Lagrangian transport simulations. We ported the Massive-Parallel Trajectory Calculations (MPTRAC) model to GPUs using the open accelerator (OpenACC) programming model. The trajectory calculations conducted within the MPTRAC model were fully ported to GPUs, i.e., except for feeding in the meteorological input data and for extracting the particle output data, the code operates entirely on the GPU devices without frequent data transfers between CPU and GPU memory. Model verification, performance analyses, and scaling tests of the MPI/OpenMP/OpenACC hybrid parallelization of MPTRAC were conducted on the JUWELS Booster supercomputer operated by the Jülich Supercomputing Centre, Germany. The JUWELS Booster comprises 3744 NVIDIA A100 Tensor Core GPUs, providing a peak performance of 71.0 PFlop/s. As of June 2021, it is the most powerful supercomputer in Europe and listed among the most energy-efficient systems internationally. For large-scale simulations comprising 10^8 particles driven by the European Centre for Medium-Range Weather Forecasts' ERA5 reanalysis, the performance evaluation showed a maximum speedup of a factor of 16 due to the utilization of GPUs compared to CPU-only runs on the JUWELS Booster. In the large-scale GPU run, about 67 % of the runtime is spent on the physics calculations, conducted on the GPUs. Another 15 % of the runtime is required for file-I/O, mostly to read the large ERA5 data set from disk. Meteorological data preprocessing on the CPUs also requires about 15 % of the runtime.
Although this study identified potential for further improvements of the GPU code, we consider the MPTRAC model ready for production runs on the JUWELS Booster in its present form. The GPU code provides a much faster time to solution than the CPU code, which is particularly relevant for near-real-time applications of a Lagrangian transport model.
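
The runtime shares quoted in the abstract above (about 67 % physics, 15 % file I/O, 15 % preprocessing) allow a rough Amdahl-style estimate of how much further the GPU run could speed up if only the physics part were accelerated. The sketch below is illustrative arithmetic, not part of MPTRAC: the function name `overall_speedup` is hypothetical, the remaining ~3 % is treated as unattributed overhead, and all shares other than physics are assumed to stay fixed.

```python
# Runtime shares quoted in the abstract for the large-scale GPU run.
physics = 0.67        # trajectory physics on the GPUs
file_io = 0.15        # mostly reading ERA5 data from disk
preprocessing = 0.15  # meteorological data preprocessing on the CPUs
other = 1.0 - (physics + file_io + preprocessing)  # unattributed remainder (assumption)


def overall_speedup(physics_gain: float) -> float:
    """Amdahl-style speedup when only the physics share gets faster."""
    return 1.0 / (physics / physics_gain + file_io + preprocessing + other)


for gain in (2.0, 4.0, 10.0):
    print(f"physics {gain:4.1f}x faster -> overall {overall_speedup(gain):.2f}x")

# Upper bound as the physics time tends to zero.
limit = 1.0 / (file_io + preprocessing + other)
print(f"upper bound: {limit:.2f}x")
```

Under these assumptions the overall gain is capped at roughly a factor of 3 no matter how fast the GPU physics becomes, which is one way to read the abstract's remark about remaining potential in I/O and preprocessing.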

Supplementary material to "Massive-Parallel Trajectory Calculations version 2.2 (MPTRAC-2.2): Lagrangian transport simulations on Graphics Processing Units (GPUs)"

10.5194/gmd-2021-382-supplement ◽

2021 ◽

Author(s):

Lars Hoffmann ◽

Paul F. Baumeister ◽

Zhongyin Cai ◽

Jan Clemens ◽

Sabine Griessbach ◽

...

Keyword(s):

Graphics Processing Units ◽

Lagrangian Transport ◽

Trajectory Calculations ◽

Supplementary Material ◽

Graphics Processing

A lightweight approach to performance portability with targetDP

The International Journal of High Performance Computing Applications ◽

10.1177/1094342016682071 ◽

2016 ◽

Vol 32 (2) ◽

pp. 288-301

Author(s):

Alan Gray ◽

Kevin Stratford

Keyword(s):

Particle Physics ◽

Message Passing ◽

Graphics Processing Units ◽

High Performance ◽

Large Scale ◽

Message Passing Interface ◽

Graphics Processing Unit ◽

Processing Unit ◽

Performance Portability ◽

Graphics Processing

Leading high performance computing systems achieve their status through use of highly parallel devices such as NVIDIA graphics processing units or Intel Xeon Phi many-core CPUs. The concept of performance portability across such architectures, as well as traditional CPUs, is vital for the application programmer. In this paper we describe targetDP, a lightweight abstraction layer which allows grid-based applications to target data parallel hardware in a platform agnostic manner. We demonstrate the effectiveness of our pragmatic approach by presenting performance results for a complex fluid application (with which the model was co-designed), plus separate lattice quantum chromodynamics particle physics code. For each application, a single source code base is seen to achieve portable performance, as assessed within the context of the Roofline model. TargetDP can be combined with Message Passing Interface (MPI) to allow use on systems containing multiple nodes: we demonstrate this through provision of scaling results on traditional and graphics processing unit-accelerated large scale supercomputers.

Large-scale transient stability simulation on graphics processing units

2009 IEEE Power & Energy Society General Meeting ◽

10.1109/pes.2009.5275844 ◽

2009 ◽

Cited By ~ 14

Author(s):

Vahid Jalili-Marandi ◽

Venkata Dinavahi

Keyword(s):

Graphics Processing Units ◽

Large Scale ◽

Transient Stability ◽

Graphics Processing

Error correlation between CO2 and CO as constraint for CO2 flux inversions using satellite data

Atmospheric Chemistry and Physics ◽

10.5194/acp-9-7313-2009 ◽

2009 ◽

Vol 9 (19) ◽

pp. 7313-7323 ◽

Cited By ~ 25

Author(s):

H. Wang ◽

D. J. Jacob ◽

M. Kopacz ◽

D. B. A. Jones ◽

P. Suntharalingam ◽

...

Keyword(s):

Inverse Modeling ◽

Large Scale ◽

Transport Model ◽

Meteorological Data ◽

Correlation Coefficients ◽

Surface Fluxes ◽

Satellite Observations ◽

Carbon Surface ◽

Chemical Transport Model ◽

Error Correlation

Abstract. Inverse modeling of CO2 satellite observations to better quantify carbon surface fluxes requires a chemical transport model (CTM) to relate the fluxes to the observed column concentrations. CTM transport error is a major source of uncertainty. We show that its effect can be reduced by using CO satellite observations as additional constraint in a joint CO2-CO inversion. CO is measured from space with high precision, is strongly correlated with CO2, and is more sensitive than CO2 to CTM transport errors on synoptic and smaller scales. Exploiting this constraint requires statistics for the CTM transport error correlation between CO2 and CO, which is significantly different from the correlation between the concentrations themselves. We estimate the error correlation globally and for different seasons by a paired-model method (comparing GEOS-Chem CTM simulations of CO2 and CO columns using different assimilated meteorological data sets for the same meteorological year) and a paired-forecast method (comparing 48- vs. 24-h GEOS-5 CTM forecasts of CO2 and CO columns for the same forecast time). We find strong error correlations (r² > 0.5) between CO2 and CO columns over much of the extra-tropical Northern Hemisphere throughout the year, and strong consistency between different methods to estimate the error correlation. Application of the averaging kernels used in the retrieval for thermal IR CO measurements weakens the correlation coefficients by 15% on average (mostly due to variability in the averaging kernels) but preserves the large-scale correlation structure. We present a simple inverse modeling application to demonstrate that CO2-CO error correlations can indeed significantly reduce uncertainty on surface carbon fluxes in a joint CO2-CO inversion vs. a CO2-only inversion.
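
The paired-model idea in the abstract above can be sketched as follows: take the difference between two simulations that differ only in their meteorological driver, separately for CO2 and CO columns, and correlate the two difference series. This is a minimal illustration on synthetic numbers, not the authors' code or data; the helper `pearson_r`, the variable names, the amplitudes, and the units are all assumptions made for the example.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Synthetic stand-in data for one grid cell: NOT real model output.
rng = random.Random(0)
n_days = 200
shared = [rng.gauss(0.0, 1.0) for _ in range(n_days)]  # common transport error

# Column differences between the two simulations (hypothetical amplitudes).
# The shared term makes the CO2 and CO error series co-vary.
d_co2 = [0.8 * s + rng.gauss(0.0, 0.5) for s in shared]
d_co = [5.0 * s + rng.gauss(0.0, 3.0) for s in shared]

r = pearson_r(d_co2, d_co)
print(f"error correlation r = {r:.2f}, r^2 = {r * r:.2f}")
```

Repeating this per grid cell and season would give a map of error correlations of the kind the abstract describes; the paired-forecast method would use 48-h minus 24-h forecast differences in place of the paired-model differences.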

The Effects of Scaling and Model Complexity in Simulating the Transport of Inorganic Micropollutants in a Lowland River Reach

Water Quality Research Journal ◽

10.2166/wqrj.2006.003 ◽

2006 ◽

Vol 41 (1) ◽

pp. 24-36 ◽

Cited By ~ 10

Author(s):

Karl-Erich Lindenschmidt ◽

René Wodrich ◽

Cornelia Hesse

Keyword(s):

Large Scale ◽

Transport Model ◽

Transport Processes ◽

Model Complexity ◽

Scale Model ◽

Small Scale ◽

Quality Model ◽

Large Scale Model ◽

Iron And Zinc ◽

Physical And Chemical

Abstract. A hypothesis stating that more complex descriptions of processes in models simulate reality better (less error) but with more unreliable predictability (more sensitivity) is tested using a river water quality model. This hypothesis was extended stating that applying the model on a domain of smaller scale requires greater complexity to capture the same accuracy as in large-scale model applications which, however, leads to increased model sensitivity. The sediment and pollutant transport model TOXI, a module in the WASP5 package, was applied to two case studies of different scale: a 90-km course of the 5th order (sensu Strahler 1952) lower Saale river, Germany (large scale), and the lock-and-weir system at Calbe (small scale) situated on the same river course. A sensitivity analysis of several parameters relating to the physical and chemical transport processes of suspended solids, chloride, arsenic, iron and zinc shows that the coefficient, which partitions the total heavy metal mass into its dissolved and sorbed fraction, is a very sensitive parameter. Hence, the complexity of the sorptive process was varied to test the hypotheses.
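
To illustrate why a partition coefficient can be such a sensitive parameter, the sketch below uses the textbook linear equilibrium relation f_d = 1 / (1 + K_d · SS) for the dissolved fraction of a metal. This is an assumption for illustration only: the abstract does not state the exact formulation used in TOXI, and the function names and the example values of K_d and suspended solids are hypothetical.

```python
def dissolved_fraction(kd_l_per_kg: float, ss_mg_per_l: float) -> float:
    """Dissolved fraction under linear equilibrium partitioning (assumed model)."""
    ss_kg_per_l = ss_mg_per_l * 1e-6  # convert mg/L to kg/L
    return 1.0 / (1.0 + kd_l_per_kg * ss_kg_per_l)


def relative_sensitivity(kd_l_per_kg: float, ss_mg_per_l: float) -> float:
    """d ln(f_d) / d ln(K_d); equals minus the sorbed fraction for this model."""
    return -(1.0 - dissolved_fraction(kd_l_per_kg, ss_mg_per_l))


ss = 20.0  # hypothetical suspended solids concentration in mg/L
for kd in (1e4, 1e5, 1e6):  # hypothetical partition coefficients in L/kg
    fd = dissolved_fraction(kd, ss)
    sens = relative_sensitivity(kd, ss)
    print(f"Kd = {kd:.0e} L/kg: f_d = {fd:.3f}, sensitivity = {sens:.3f}")
```

In this simple model the relative sensitivity approaches -1 once most of the metal is sorbed, so a given percentage error in K_d translates almost fully into the predicted dissolved concentration.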

Large-scale analytical Fourier transform of photomask layouts using graphics processing units

10.1117/12.2192040 ◽

2015 ◽

Author(s):

Julia A. Sakamoto

Keyword(s):

Fourier Transform ◽

Graphics Processing Units ◽

Large Scale ◽

Graphics Processing

Toward large-scale Hybrid Monte Carlo simulations of the Hubbard model on graphics processing units

Computer Physics Communications ◽

10.1016/j.cpc.2011.04.014 ◽

2011 ◽

Vol 182 (8) ◽

pp. 1651-1656 ◽

Cited By ~ 5

Author(s):

Kyle A. Wendt ◽

Joaquín E. Drut ◽

Timo A. Lähde

Keyword(s):

Monte Carlo ◽

Monte Carlo Simulations ◽

Hubbard Model ◽

Graphics Processing Units ◽

Large Scale ◽

Hybrid Monte Carlo ◽

Graphics Processing

Error correlation between CO2 and CO as constraint for CO2 flux inversions using satellite data

Atmospheric Chemistry and Physics Discussions ◽

10.5194/acpd-9-11783-2009 ◽

2009 ◽

Vol 9 (3) ◽

pp. 11783-11810

Author(s):

H. Wang ◽

D. J. Jacob ◽

M. Kopacz ◽

D. B. A. Jones ◽

P. Suntharalingam ◽

...

Keyword(s):

Inverse Modeling ◽

Large Scale ◽

Transport Model ◽

Meteorological Data ◽

Co2 Flux ◽

Surface Fluxes ◽

Satellite Observations ◽

Carbon Surface ◽

Chemical Transport Model ◽

Error Correlation

Abstract. Inverse modeling of CO2 satellite observations to better quantify carbon surface fluxes requires a forward model such as a chemical transport model (CTM) to relate the fluxes to the observed column concentrations. Model transport error is an important source of observational error. We investigate the potential of using CO satellite observations as additional constraints in a joint CO2–CO inversion to improve CO2 flux estimates, by exploiting the CTM transport error correlations between CO2 and CO. We estimate the error correlation globally and for different seasons by a paired-model method (comparing CTM simulations of CO2 and CO columns using different assimilated meteorological data sets for the same meteorological year) and a paired-forecast method (comparing 48- vs. 24-h CTM forecasts of CO2 and CO columns for the same forecast time). We find strong positive and negative error correlations (r² > 0.5) between CO2 and CO columns over much of the world throughout the year, and strong consistency between different methods to estimate the error correlation. Application of the averaging kernels used in the retrieval for thermal IR CO measurements weakens the correlation coefficients by 15% on average (mostly due to variability in the averaging kernels) but preserves the large-scale correlation structure. Results from a testbed inverse modeling application show that CO2–CO error correlations can indeed significantly reduce uncertainty on surface carbon fluxes in a joint CO2–CO inversion vs. a CO2-only inversion.

Impact of Lagrangian transport on lower-stratospheric transport timescales in a climate model

Atmospheric Chemistry and Physics ◽

10.5194/acp-20-15227-2020 ◽

2020 ◽

Vol 20 (23) ◽

pp. 15227-15245

Author(s):

Edward J. Charlesworth ◽

Ann-Kristin Dugstad ◽

Frauke Fritsch ◽

Patrick Jöckel ◽

Felix Plöger

Keyword(s):

Atmospheric Chemistry ◽

Large Scale ◽

Climate Model ◽

Polar Vortex ◽

Transport Processes ◽

Lagrangian Model ◽

Trace Gas ◽

Lagrangian Transport ◽

Transport Barriers ◽

The Impact

Abstract. We investigate the impact of model trace gas transport schemes on the representation of transport processes in the upper troposphere and lower stratosphere. Towards this end, the Chemical Lagrangian Model of the Stratosphere (CLaMS) was coupled to the ECHAM/MESSy Atmospheric Chemistry (EMAC) model and results from the two transport schemes (Lagrangian critical Lyapunov scheme and flux-form semi-Lagrangian, respectively) were compared. Advection in CLaMS was driven by the EMAC simulation winds, and thereby the only differences in transport between the two sets of results were caused by differences in the transport schemes. To analyze the timescales of large-scale transport, multiple tropical-surface-emitted tracer pulses were performed to calculate age of air spectra, while smaller-scale transport was analyzed via idealized, radioactively decaying tracers emitted in smaller regions (nine grid cells) within the stratosphere. The results show that stratospheric transport barriers are significantly stronger for Lagrangian EMAC-CLaMS transport due to reduced numerical diffusion. In particular, stronger tracer gradients emerge around the polar vortex, at the subtropical jets, and at the edge of the tropical pipe. Inside the polar vortex, the more diffusive EMAC flux-form semi-Lagrangian transport scheme results in a substantially higher amount of air with ages from 0 to 2 years (up to a factor of 5 higher). In the lowermost stratosphere, mean age of air is much smaller in EMAC, owing to stronger diffusive cross-tropopause transport. Conversely, EMAC-CLaMS shows a summertime lowermost stratosphere age inversion – a layer of older air residing below younger air (an “eave”). This pattern is caused by strong poleward transport above the subtropical jet and is entirely blurred by diffusive cross-tropopause transport in EMAC. Potential consequences from the choice of the transport scheme on chemistry–climate and geoengineering simulations are discussed.
