Performance Modeling based on Multidimensional Surface Learning for Performance Predictions of Parallel Applications in Non-Dedicated Environments

Author(s): J. Yagnik, H.A. Sanjay, S. Vadhiyar
Author(s): Diane Kuhl Mitchell, Charneta Samms

For at least a decade, researchers at the Army Research Laboratory (ARL) have predicted mental workload using human performance modeling (HPM) tools, primarily IMPRINT. During this timeframe their projects have matured from simple models of human behavior to complex analyses of the interactions of system design and human behavior. As part of this maturation process, the researchers learned: 1) to develop a modeling question that incorporates all aspects of workload, 2) to determine when workload is most likely to affect performance, 3) to build multiple models to represent experimental conditions, 4) to connect performance predictions to an overall mission or system capability, and 5) to present results in a clear, concise format. By implementing the techniques they developed from these lessons learned, the researchers have had an impact on major Army programs with their workload predictions. Specifically, they have successfully changed design requirements for future concept Army vehicles, substantiated manpower requirements for fielded Army vehicles, and made Soldier workload the number one item during preliminary design review for a major Army future concept vehicle program. The effective techniques the ARL researchers developed for their IMPRINT projects are applicable to other HPM tools. In addition, they can be used by students and researchers who are doing human performance modeling projects and are confronted with similar problems, to help them achieve project success.


Author(s): James D. Stevens, Andreas Klöckner

The ability to model, analyze, and predict execution time of computations is an important building block that supports numerous efforts, such as load balancing, benchmarking, job scheduling, developer-guided performance optimization, and the automation of performance tuning for high performance, parallel applications. In today’s increasingly heterogeneous computing environment, this task must be accomplished efficiently across multiple architectures, including massively parallel coprocessors like GPUs, which are increasingly prevalent in the world’s fastest supercomputers. To address this challenge, we present an approach for constructing customizable, cross-machine performance models for GPU kernels, including a mechanism to automatically and symbolically gather performance-relevant kernel operation counts, a tool for formulating mathematical models using these counts, and a customizable parameterized collection of benchmark kernels used to calibrate models to GPUs in a black-box fashion. With this approach, we empower the user to manage trade-offs between model accuracy, evaluation speed, and generalizability. A user can define their own model and customize the calibration process, making it as simple or complex as desired, and as application-targeted or general as desired. As application examples of our approach, we demonstrate both linear and nonlinear models; these examples are designed to predict execution times for multiple variants of a particular computation: two matrix-matrix multiplication variants, four discontinuous Galerkin differentiation operation variants, and two 2D five-point finite difference stencil variants. For each variant, we present accuracy results on GPUs from multiple vendors and hardware generations. 
We view this highly user-customizable approach as a response to a central question in GPU performance modeling: how can we model GPU performance in a cost-explanatory fashion while maintaining accuracy, evaluation speed, portability, and ease of use? We believe this last attribute precludes approaches that require manual collection of kernel or hardware statistics.
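The linear-model variant described above can be illustrated with a small sketch: execution time is modeled as a weighted sum of per-kernel operation counts, and the weights (effective per-operation costs) are calibrated from benchmark timings via least squares. This is a minimal illustration of the general idea, not the paper's actual tooling; all counts, timings, and names below are invented for the example.

```python
# Hypothetical sketch of linear performance-model calibration: fit
# per-operation costs so that (counts @ costs) approximates measured times.
import numpy as np

# Each row: operation counts for one benchmark kernel, e.g.
# (fp32 ops, global-memory bytes loaded, global-memory bytes stored).
counts = np.array([
    [1.0e9, 4.0e8, 5.0e7],
    [2.0e9, 2.0e8, 2.5e8],
    [5.0e8, 8.0e8, 1.0e8],
    [3.0e9, 1.0e8, 2.0e8],
])

# Illustrative wall-clock times (seconds) for those kernels on one GPU.
times = np.array([0.010, 0.017, 0.0125, 0.020])

# Calibrate: least-squares fit of per-operation costs to the timings.
costs, *_ = np.linalg.lstsq(counts, times, rcond=None)

def predict(kernel_counts):
    """Predict execution time for a kernel from its operation counts."""
    return float(np.dot(kernel_counts, costs))

# Predict a new kernel's runtime from its (symbolically gathered) counts.
t_est = predict([1.5e9, 3.0e8, 1.5e8])
```

Re-calibrating `costs` on a different GPU, using the same benchmark kernels, is what makes such a model portable across machines in a black-box fashion; nonlinear models replace the dot product with a richer functional form.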


Author(s): Marco Lattuada, Christian Pilato, Antonino Tumeo, Fabrizio Ferrandi

2014, Vol. 11 (4), pp. 1315-1336
Author(s): Stanislav Böhm, Marek Bĕhálek, Ondřej Meca, Martin Surkovský

In our research, we try to simplify the development of parallel applications in the area of scientific and engineering computations for distributed memory systems. The difficulties of this task lie not only in programming itself, but also in the complexity of supporting activities such as debugging and performance analysis. We are developing a unifying framework in which it is possible to create parallel applications and perform these supporting activities. The unifying element that interconnects all of them is a visual model inspired by Colored Petri Nets. It is used to define the parallel behavior of an application, and the same model is used to show the inner state of the developed application back to the user. This paper presents how this approach is extended to debugging, tracing, and performance prediction, and the benefits obtained by their interconnection. The presented ideas are integrated into our open source tool Kaira (http://verif.cs.vsb.cz/kaira), a prototyping tool in which a user can implement his/her ideas and experiment with them in a short time, create a real running program, and verify its performance and scalability.
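The Colored-Petri-Net-style semantics underlying such a visual model can be sketched in a few lines: places hold typed ("colored") tokens, and a transition fires by consuming tokens from its input places and producing tokens on its output places. This is a generic toy sketch of that execution model, not Kaira's actual API; all class and variable names are invented for illustration.

```python
# Toy Colored-Petri-Net-like execution model (hypothetical, not Kaira's API).
from collections import deque

class Place:
    """A place holding a queue of colored tokens (arbitrary Python values)."""
    def __init__(self, name):
        self.name = name
        self.tokens = deque()

class Transition:
    """A transition consuming from input places and producing on outputs."""
    def __init__(self, inputs, outputs, fn):
        self.inputs = inputs    # places tokens are consumed from
        self.outputs = outputs  # places results are produced on
        self.fn = fn            # computation applied to consumed tokens

    def enabled(self):
        # Fireable only when every input place holds at least one token.
        return all(p.tokens for p in self.inputs)

    def fire(self):
        args = [p.tokens.popleft() for p in self.inputs]
        for out, val in zip(self.outputs, self.fn(*args)):
            out.tokens.append(val)

# A tiny net: two input places feed a "combine" transition.
a, b, out = Place("a"), Place("b"), Place("out")
t = Transition([a, b], [out], lambda x, y: [x + y])

a.tokens.append(2)
b.tokens.append(3)
while t.enabled():
    t.fire()
# "out" now holds the combined token; a tool built on this model can render
# the net graphically and replay token movements to expose the inner state
# of a running application, which is the debugging/tracing idea above.
```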

