Scientific Computing on the Itanium® Processor

Scientific Programming ◽

10.1155/2002/193478 ◽

2002 ◽

Vol 10 (4) ◽

pp. 329-337 ◽

Cited By ~ 2

Author(s):

Bruce Greer ◽

John Harrison ◽

Greg Henry ◽

Wei Li ◽

Peter Tang

Keyword(s):

Linear Algebra ◽

High Performance ◽

Scientific Computing ◽

Peak Performance ◽

Low Level ◽

Architectural Features ◽

Transcendental Functions ◽

High Performance Computer ◽

High Level ◽

Engineering Computing

The 64-bit Intel® Itanium® architecture is designed for high-performance scientific and enterprise computing, and the Itanium processor is its first silicon implementation. Features such as extensive arithmetic support, predication, speculation, and explicit parallelism can be used to provide a sound infrastructure for supercomputing. A large number of high-performance computer companies are offering Itanium®-based systems, some capable of peak performance exceeding 50 GFLOPS. In this paper we give an overview of the most relevant architectural features and provide illustrations of how these features are used in both low-level and high-level support for scientific and engineering computing, including transcendental functions and linear algebra kernels.

High-Level Parallel Ant Colony Optimization with Algorithmic Skeletons

International Journal of Parallel Programming ◽

10.1007/s10766-021-00714-1 ◽

2021 ◽

Author(s):

Breno A. de Melo Menezes ◽

Nina Herrmann ◽

Herbert Kuchen ◽

Fernando Buarque de Lima Neto

Keyword(s):

Ant Colony Optimization ◽

High Performance ◽

Optimization Problems ◽

Programming Model ◽

Parallel Implementation ◽

Ant Colony ◽

Algorithmic Skeletons ◽

Low Level ◽

Programming Patterns ◽

High Level

Parallel implementations of swarm intelligence algorithms such as the ant colony optimization (ACO) have been widely used to shorten the execution time when solving complex optimization problems. When aiming for a GPU environment, developing efficient parallel versions of such algorithms using CUDA can be a difficult and error-prone task even for experienced programmers. To overcome this issue, the parallel programming model of Algorithmic Skeletons simplifies parallel programs by abstracting from low-level features. This is realized by defining common programming patterns (e.g., map, fold and zip) that later on will be converted to efficient parallel code. In this paper, we show how algorithmic skeletons formulated in the domain-specific language Musket can cope with the development of a parallel implementation of ACO and how that compares to a low-level implementation. Our experimental results show that Musket suits the development of ACO. Besides making it easier for the programmer to deal with the parallelization aspects, Musket generates high-performance code with similar execution times when compared to low-level implementations.
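The map, fold and zip patterns named in this abstract can be illustrated with a minimal sequential sketch in plain Python. This is not Musket code and does not reflect its syntax or its generated CUDA. The function names `evaporate` and `tour_length`, the evaporation rate, and the distance matrix layout are all hypothetical, chosen only to show how two typical ACO steps decompose into these patterns.

```python
from functools import reduce

def evaporate(pheromone, rate):
    """map: apply the same independent update to every edge's pheromone value."""
    return list(map(lambda tau: (1.0 - rate) * tau, pheromone))

def tour_length(tour, dist):
    """zip + fold: pair each city with its successor, then sum the leg lengths."""
    # zip: (city, next city), wrapping around from the last city to the first
    legs = zip(tour, tour[1:] + tour[:1])
    # fold: reduce all leg distances to a single total
    return reduce(lambda acc, leg: acc + dist[leg[0]][leg[1]], legs, 0.0)
```

Because each element of a map is independent and a fold with an associative operator can be evaluated as a tree, a skeleton framework is free to execute both in parallel, which is the property the abstract relies on.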

Performance Analysis of Specification Computer and Mobile with Implementation Tawaf Virtual Reality using A* Algorithm and RVO System

EMITTER International Journal of Engineering Technology ◽

10.24003/emitter.v7i1.321 ◽

2019 ◽

Vol 7 (1) ◽

pp. 55-70

Author(s):

Moh. Zikky ◽

M. Jainal Arifin ◽

Kholid Fathoni ◽

Agus Zainal Arifin

Keyword(s):

Virtual Reality ◽

High Performance ◽

Large Scale ◽

3D Models ◽

A* Algorithm ◽

Virtual Reality Technology ◽

Performance Technology ◽

Outer Line ◽

High Performance Computer ◽

High Level

A High-Performance Computer (HPC) is a computer system built to handle heavy computational loads. HPC provides high-performance technology and shortens computing time. The technology is often used in large-scale industries and in activities that require high-level computing, such as rendering virtual reality. In this research, we present a Tawaf Virtual Reality with 1000 pilgrims and realistic surroundings of Masjidil-Haram, an interactive and immersive simulation built from 3D models. The main purpose of this study is to measure and understand the processing time of this Virtual Reality implementation of tawaf activities on various platforms, such as a computer and an Android smartphone. The results show that an agent on the outer line (the outer rotation) around the Kaa'bah mostly takes the least time, even though it must travel a longer distance than an agent closer in. This happens because an agent closer to the Kaa'bah faces crowds of people. In this case, obstacles have more impact than distance.

Memristive Accelerators for Dense and Sparse Linear Algebra: From Machine Learning to High-Performance Scientific Computing

IEEE Micro ◽

10.1109/mm.2018.2885498 ◽

2019 ◽

Vol 39 (1) ◽

pp. 58-61

Author(s):

Engin Ipek

Keyword(s):

Machine Learning ◽

Linear Algebra ◽

High Performance ◽

Scientific Computing ◽

Sparse Linear Algebra

EM3DANI: A Julia package for fully anisotropic 3D forward modeling of electromagnetic data

Geophysics ◽

10.1190/geo2020-0489.1 ◽

2021 ◽

pp. 1-45

Author(s):

Ronghua Peng ◽

Bo Han ◽

Yajun Liu ◽

Xiangyun Hu

Keyword(s):

High Performance ◽

Three Dimensional ◽

Anisotropic Media ◽

Forward Modeling ◽

Hydrocarbon Exploration ◽

Third Party ◽

Low Level ◽

Numerical Computing ◽

Computationally Intensive ◽

High Level

Forward modeling is vital for three-dimensional (3D) inversion and interpretation of electromagnetic (EM) data in anisotropic media, which is one of the major challenges in the field of EM geophysics. However, there are few freely available 3D codes that are capable of modeling EM responses in fully anisotropic media. Besides, most of the existing 3D EM codes are written in low-level languages such as C and Fortran, making them difficult to read, maintain and extend. Taking advantage of recent progress in computer technology and numerical methods, we have developed an open-source package for forward modeling of frequency-domain EM fields in a fully 3D anisotropic earth (EM3DANI) using the Julia language, a relatively young, high-level programming language with a focus on high performance. Based on a mimetic finite-volume (MFV) discretization of the governing equations, the modeling algorithm is expressed in an abstract form in terms of matrices/vectors and thus can be easily implemented by using any high-level language commonly used for numerical computing. Existing libraries written in low-level languages can be easily integrated into a Julia code without the so-called two-language problem, thus we have exploited several mature third-party packages to deal with computationally intensive parts of the forward modeling, which guarantees high stability and efficiency. We have elaborated the structure of the package, paying special attention to code usability, readability and extendability, while striving to retain versatility and high performance. The effectiveness of the code is demonstrated through two 1D synthetic examples for magnetotellurics (MT) and controlled-source electromagnetics (CSEM) problems, respectively. High accuracy and efficiency can be achieved for both 1D examples. We further present a 3D example mimicking a marine CSEM survey scenario for hydrocarbon exploration. The simulation results indicate that the effect of the anisotropy on forward responses is significant, and can be comparable to that of the target reservoir.

EXECUTION OF SEQUENTIAL AND PARALLEL JAVA BYTECODE IN A METACOMPUTING SYSTEM

Parallel Processing Letters ◽

10.1142/s0129626403001148 ◽

2003 ◽

Vol 13 (01) ◽

pp. 53-64 ◽

Cited By ~ 1

Author(s):

ERIC GAMESS

Keyword(s):

Linear Algebra ◽

Virtual Machine ◽

Message Passing ◽

High Performance ◽

Scientific Computing ◽

Message Passing Interface ◽

Java Virtual Machine ◽

Parallel Applications ◽

Beowulf Cluster ◽

Java Bytecode

In this paper, we address the goal of executing Java parallel applications in a group of nodes of a Beowulf cluster transparently chosen by a metacomputing system oriented to efficient execution of Java bytecode, with support for scientific computing. To this end, we extend the Java virtual machine by providing a message passing interface and quick access to distributed high performance resources. Also, we introduce the execution of parallel linear algebra methods for large objects from sequential Java applications by invoking SPLAM, our parallel linear algebra package.

Linnea

ACM Transactions on Mathematical Software ◽

10.1145/3446632 ◽

2021 ◽

Vol 47 (3) ◽

pp. 1-26

Author(s):

Henrik Barthels ◽

Christos Psarras ◽

Paolo Bientinesi

Keyword(s):

Linear Algebra ◽

High Performance ◽

Search Algorithm ◽

Test Problems ◽

Code Generator ◽

Matrix Computations ◽

Significant Performance ◽

High Level ◽

Almost All ◽

Performance Computing

The translation of linear algebra computations into efficient sequences of library calls is a non-trivial task that requires expertise in both linear algebra and high-performance computing. Almost all high-level languages and libraries for matrix computations (e.g., Matlab, Eigen) internally use optimized kernels such as those provided by BLAS and LAPACK; however, their translation algorithms are often too simplistic and thus lead to a suboptimal use of said kernels, resulting in significant performance losses. To combine the productivity offered by high-level languages, and the performance of low-level kernels, we are developing Linnea, a code generator for linear algebra problems. As input, Linnea takes a high-level description of a linear algebra problem; as output, it returns an efficient sequence of calls to high-performance kernels. Linnea uses a custom best-first search algorithm to find a first solution in less than a second, and increasingly better solutions when given more time. In 125 test problems, the code generated by Linnea almost always outperforms Matlab, Julia, Eigen, and Armadillo, with speedups up to and exceeding 10×.
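To see why the choice of kernel-call sequence matters, consider the product A B x with square matrices A, B and a vector x. The sketch below is not Linnea's best-first search. It swaps in the textbook matrix-chain dynamic program and counts scalar multiplications only, under the assumption of dense matrices with no special structure. The function names and the `dims` convention (matrix i has shape `dims[i]` by `dims[i + 1]`) are hypothetical.

```python
def left_to_right_cost(dims):
    """Scalar multiplications when the chain is evaluated strictly left to right."""
    return sum(dims[0] * dims[i] * dims[i + 1] for i in range(1, len(dims) - 1))

def optimal_cost(dims):
    """Minimum scalar multiplications over all parenthesizations (classic DP)."""
    n = len(dims) - 1  # number of matrices in the chain
    best = [[0] * n for _ in range(n)]
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            # try every split point k between matrix i and matrix j
            best[i][j] = min(
                best[i][k] + best[k + 1][j] + dims[i] * dims[k + 1] * dims[j + 1]
                for k in range(i, j)
            )
    return best[0][n - 1]

# A (1000 x 1000), B (1000 x 1000), x (1000 x 1)
dims = [1000, 1000, 1000, 1]
naive = left_to_right_cost(dims)   # (A B) x
best = optimal_cost(dims)          # A (B x)
```

For these shapes the left-to-right order performs a full matrix-matrix product first, while the better order needs only two matrix-vector products, so the two counts differ by a factor of about 500.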

Brian 2, an intuitive and efficient neural simulator

eLife ◽

10.7554/elife.47314 ◽

2019 ◽

Vol 8 ◽

Cited By ~ 41

Author(s):

Marcel Stimberg ◽

Romain Brette ◽

Dan FM Goodman

Keyword(s):

Code Generation ◽

High Performance ◽

Network Models ◽

Neural Network Models ◽

Dynamical Equations ◽

Low Level ◽

Pyloric Network ◽

Experimental Protocols ◽

Description Languages ◽

High Level

Brian 2 allows scientists to simply and efficiently simulate spiking neural network models. These models can feature novel dynamical equations, their interactions with the environment, and experimental protocols. To preserve high performance when defining new models, most simulators offer two options: low-level programming or description languages. The first option requires expertise, is prone to errors, and is problematic for reproducibility. The second option cannot describe all aspects of a computational experiment, such as the potentially complex logic of a stimulation protocol. Brian addresses these issues using runtime code generation. Scientists write code with simple and concise high-level descriptions, and Brian transforms them into efficient low-level code that can run interleaved with their code. We illustrate this with several challenging examples: a plastic model of the pyloric network, a closed-loop sensorimotor model, a programmatic exploration of a neuron model, and an auditory model with real-time input.
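The general idea of runtime code generation can be shown with a toy sketch: a right-hand side given as a string is turned into source code, compiled while the program runs, and returned as an ordinary function. This is not Brian's API or its code generator. The helper `make_euler_step`, the forward-Euler scheme, and the example equation are assumptions made purely for illustration, and the sketch emits Python rather than low-level code.

```python
def make_euler_step(rhs, var, params):
    """Generate a forward-Euler update function from a right-hand-side string."""
    args = ", ".join([var, "dt"] + list(params))
    # Build the source text of the update function from the description
    source = f"def step({args}):\n    return {var} + dt * ({rhs})\n"
    namespace = {}
    exec(source, namespace)  # compile the generated source at run time
    return namespace["step"], source

# Hypothetical leaky-integrator description: dv/dt = (-v + I) / tau
step, source = make_euler_step("(-v + I) / tau", "v", ["I", "tau"])
v_next = step(0.0, 0.1, 1.0, 1.0)  # one step from v = 0 with dt = 0.1
```

The generated `step` is a normal callable, so it can be interleaved with arbitrary user code such as a stimulation protocol, which is the flexibility the abstract attributes to the runtime approach.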

Accessible C-programming course from scratch using a MOOC platform without limitations

Proceedings of the 4th International Conference on Higher Education Advances (HEAd'18) ◽

10.4995/head18.2018.8176 ◽

2018 ◽

Author(s):

Jose Antonio Belloch ◽

Adrián Castelló ◽

Sergio Iserte

Keyword(s):

Computer Science ◽

Computer Architecture ◽

High Performance ◽

Application Development ◽

C Language ◽

Low Level ◽

System A ◽

Science Degree ◽

C Programming ◽

High Level

The C language has long been used for application development in multidisciplinary environments. In academia, however, it is being replaced by higher-level languages because they are easier to understand, learn and apply. Moreover, the need for professionals with good knowledge of those high-level languages is constantly increasing because of the boom in mobile devices. This scenario creates a shortage of low-level language programmers, who are required in other less trendy but equally or more important fields, such as science, engineering or research. In order to revive interest in low-level languages and provide those minority fields with well-prepared staff, we present in this work a MOOC C-programming course addressed to anyone, with or without an IT background. A feature that differentiates this course from other online programming courses is that we focus mainly on C language syntax, providing, via a self-tuned virtual machine, an encapsulated environment that hides any interaction with the command line of the underlying operating system. A secondary goal of this work is to encourage computer science degree students to enrol in the computer architecture specialization at the Universitat Jaume I (Spain). For this purpose, the High Performance Computing and Architectures research group of that university has decided to use this C course as a tool to fill the gap in the current syllabus. The results show that half of the participants who completed the first session of the course finished it satisfactorily, and the number of computer science degree students who chose the computer architecture specialization in the following academic year increased by a factor of 3.

A Python-based optimization framework for high-performance genomics

10.1101/2020.10.29.361402 ◽

2020 ◽

Author(s):

Ariya Shajii ◽

Ibrahim Numanagić ◽

Alexander T. Leighton ◽

Haley Greenyer ◽

Saman Amarasinghe ◽

...

Keyword(s):

High Performance ◽

Optimization Techniques ◽

Computational Genomics ◽

Next Generation Sequencing Data ◽

Sequencing Data ◽

Low Level ◽

Performance Improvements ◽

Optimization Framework ◽

Novice Programmer ◽

High Level

Exponentially growing next-generation sequencing data requires high-performance tools and algorithms. Nevertheless, the implementation of high-performance computational genomics software is inaccessible to many scientists because it requires extensive knowledge of low-level software optimization techniques, forcing scientists to resort to high-level software alternatives that are less efficient. Here, we introduce Seq, a Python-based optimization framework that combines the power and usability of high-level languages like Python with the performance of low-level languages like C or C++. Seq allows for shorter, simpler code, is readily usable by a novice programmer, and obtains significant performance improvements over existing languages and frameworks. We showcase and evaluate Seq by implementing seven standard, widely used applications from all stages of the genomics analysis pipeline, including genome index construction, finding maximal exact matches, long-read alignment and haplotype phasing, and demonstrate that its implementations are up to an order of magnitude faster than existing hand-optimized implementations, with just a fraction of the code. By enabling researchers of all backgrounds to easily implement high-performance analysis tools, Seq further opens the door to the democratization and scalability of computational genomics.

High-level vs low-level parallel programming for scientific computing

Proceedings 16th International Parallel and Distributed Processing Symposium ◽

10.1109/ipdps.2002.1016644 ◽

2002 ◽

Author(s):

Yi Pan

Keyword(s):

Parallel Programming ◽

Scientific Computing ◽

Low Level ◽

High Level
