Performance Modeling based on Multidimensional Surface Learning for Performance Predictions of Parallel Applications in Non-Dedicated Environments

Author(s): J. Yagnik, H.A. Sanjay, S. Vadhiyar
Author(s): Diane Kuhl Mitchell, Charneta Samms

For at least a decade, researchers at the Army Research Laboratory (ARL) have predicted mental workload using human performance modeling (HPM) tools, primarily IMPRINT. During this timeframe their projects have matured from simple models of human behavior to complex analyses of the interactions of system design and human behavior. As part of this maturation process, the researchers learned: 1) to develop a modeling question that incorporates all aspects of workload, 2) to determine when workload is most likely to affect performance, 3) to build multiple models to represent experimental conditions, 4) to connect performance predictions to an overall mission or system capability, and 5) to present results in a clear, concise format. By implementing the techniques they developed from these lessons learned, the researchers have had an impact on major Army programs with their workload predictions. Specifically, they have successfully changed design requirements for future concept Army vehicles, substantiated manpower requirements for fielded Army vehicles, and made Soldier workload the number one item during preliminary design review for a major Army future concept vehicle program. The effective techniques the ARL researchers developed for their IMPRINT projects are applicable to other HPM tools. In addition, they can be used by students and researchers who are doing human performance modeling projects and are confronted with similar problems, to help them achieve project success.


Author(s): James D. Stevens, Andreas Klöckner

The ability to model, analyze, and predict execution time of computations is an important building block that supports numerous efforts, such as load balancing, benchmarking, job scheduling, developer-guided performance optimization, and the automation of performance tuning for high performance, parallel applications. In today’s increasingly heterogeneous computing environment, this task must be accomplished efficiently across multiple architectures, including massively parallel coprocessors like GPUs, which are increasingly prevalent in the world’s fastest supercomputers. To address this challenge, we present an approach for constructing customizable, cross-machine performance models for GPU kernels, including a mechanism to automatically and symbolically gather performance-relevant kernel operation counts, a tool for formulating mathematical models using these counts, and a customizable parameterized collection of benchmark kernels used to calibrate models to GPUs in a black-box fashion. With this approach, we empower the user to manage trade-offs between model accuracy, evaluation speed, and generalizability. A user can define their own model and customize the calibration process, making it as simple or complex as desired, and as application-targeted or general as desired. As application examples of our approach, we demonstrate both linear and nonlinear models; these examples are designed to predict execution times for multiple variants of a particular computation: two matrix-matrix multiplication variants, four discontinuous Galerkin differentiation operation variants, and two 2D five-point finite difference stencil variants. For each variant, we present accuracy results on GPUs from multiple vendors and hardware generations. 
We view this highly user-customizable approach as a response to a central question in GPU performance modeling: how can we model GPU performance in a cost-explanatory fashion while maintaining accuracy, evaluation speed, portability, and ease of use? We believe this last attribute precludes approaches that require manual collection of kernel or hardware statistics.
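The linear-model variant described above can be illustrated with a small sketch: execution time is modeled as a weighted sum of per-kernel operation counts, and the weights (effective per-operation costs) are calibrated from benchmark timings via least squares. This is a minimal illustration of the general idea, not the paper's actual tooling; all counts, timings, and names below are invented for the example.

```python
# Hypothetical sketch of linear performance-model calibration: fit
# per-operation costs so that (counts @ costs) approximates measured times.
import numpy as np

# Each row: operation counts for one benchmark kernel, e.g.
# (fp32 ops, global-memory bytes loaded, global-memory bytes stored).
counts = np.array([
    [1.0e9, 4.0e8, 5.0e7],
    [2.0e9, 2.0e8, 2.5e8],
    [5.0e8, 8.0e8, 1.0e8],
    [3.0e9, 1.0e8, 2.0e8],
])

# Illustrative wall-clock times (seconds) for those kernels on one GPU.
times = np.array([0.010, 0.017, 0.0125, 0.020])

# Calibrate: least-squares fit of per-operation costs to the timings.
costs, *_ = np.linalg.lstsq(counts, times, rcond=None)

def predict(kernel_counts):
    """Predict execution time for a kernel from its operation counts."""
    return float(np.dot(kernel_counts, costs))

# Predict a new kernel's runtime from its (symbolically gathered) counts.
t_est = predict([1.5e9, 3.0e8, 1.5e8])
```

Re-calibrating `costs` on a different GPU, using the same benchmark kernels, is what makes such a model portable across machines in a black-box fashion; nonlinear models replace the dot product with a richer functional form.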


Author(s): Marco Lattuada, Christian Pilato, Antonino Tumeo, Fabrizio Ferrandi

2014, Vol. 11 (4), pp. 1315-1336
Author(s): Stanislav Böhm, Marek Bĕhálek, Ondřej Meca, Martin Surkovský

In our research, we try to simplify the development of parallel applications in the area of scientific and engineering computations for distributed memory systems. The difficulties of this task lie not only in programming itself, but also in the complexity of supporting activities such as debugging and performance analysis. We are developing a unifying framework in which it is possible to create parallel applications and perform these supporting activities. The unifying element that interconnects all of them is a visual model inspired by Colored Petri Nets. It is used to define the parallel behavior of an application, and the same model is used to show the inner state of the developed application back to the user. This paper presents how this approach is extended to debugging, tracing, and performance prediction, and the benefits obtained by their interconnection. The presented ideas are integrated into our open source tool Kaira (http://verif.cs.vsb.cz/kaira), a prototyping tool in which a user can implement his/her ideas and experiment with them in a short time, create a real running program, and verify its performance and scalability.
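The Colored-Petri-Net-style semantics underlying such a visual model can be sketched in a few lines: places hold typed ("colored") tokens, and a transition fires by consuming tokens from its input places and producing tokens on its output places. This is a generic toy sketch of that execution model, not Kaira's actual API; all class and variable names are invented for illustration.

```python
# Toy Colored-Petri-Net-like execution model (hypothetical, not Kaira's API).
from collections import deque

class Place:
    """A place holding a queue of colored tokens (arbitrary Python values)."""
    def __init__(self, name):
        self.name = name
        self.tokens = deque()

class Transition:
    """A transition consuming from input places and producing on outputs."""
    def __init__(self, inputs, outputs, fn):
        self.inputs = inputs    # places tokens are consumed from
        self.outputs = outputs  # places results are produced on
        self.fn = fn            # computation applied to consumed tokens

    def enabled(self):
        # Fireable only when every input place holds at least one token.
        return all(p.tokens for p in self.inputs)

    def fire(self):
        args = [p.tokens.popleft() for p in self.inputs]
        for out, val in zip(self.outputs, self.fn(*args)):
            out.tokens.append(val)

# A tiny net: two input places feed a "combine" transition.
a, b, out = Place("a"), Place("b"), Place("out")
t = Transition([a, b], [out], lambda x, y: [x + y])

a.tokens.append(2)
b.tokens.append(3)
while t.enabled():
    t.fire()
# "out" now holds the combined token; a tool built on this model can render
# the net graphically and replay token movements to expose the inner state
# of a running application, which is the debugging/tracing idea above.
```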

