Locality-Aware Task Scheduling and Data Distribution for OpenMP Programs on NUMA Systems and Manycore Processors

Scientific Programming ◽

10.1155/2015/981759 ◽

2015 ◽

Vol 2015 ◽

pp. 1-16 ◽

Cited By ~ 7

Author(s):

Ananya Muddukrishna ◽

Peter A. Jonsson ◽

Mats Brorsson

Keyword(s):

Task Scheduling ◽

Data Distribution ◽

Data Access ◽

Improve Performance ◽

Manycore Processors ◽

Cache Access ◽

On Chip ◽

Processor Caches ◽

Architectural Knowledge ◽

The Impact

Performance degradation due to nonuniform data access latencies has worsened on NUMA systems and can now be felt on-chip in manycore processors. Distributing data across NUMA nodes and manycore processor caches is necessary to reduce the impact of nonuniform latencies. However, techniques for distributing data are error-prone and fragile and require low-level architectural knowledge. Existing task scheduling policies favor quick load-balancing at the expense of locality and ignore NUMA node/manycore cache access latencies while scheduling. Locality-aware scheduling, in conjunction with or as a replacement for existing scheduling, is necessary to minimize NUMA effects and sustain performance. We present a data distribution and locality-aware scheduling technique for task-based OpenMP programs executing on NUMA systems and manycore processors. Our technique relieves the programmer from thinking of NUMA system/manycore processor architecture details by delegating data distribution to the runtime system and uses task data dependence information to guide the scheduling of OpenMP tasks to reduce data stall times. We demonstrate our technique on a four-socket AMD Opteron machine with eight NUMA nodes and on the TILEPro64 processor and identify that data distribution and locality-aware task scheduling improve performance up to 69% for scientific benchmarks compared to default policies and yet provide an architecture-oblivious approach for programmers.
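The idea of using task data dependence information to place tasks near their data can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' runtime: the round-robin distribution policy, the block names and sizes, and the helper names `distribute_blocks` and `pick_node` are all hypothetical.

```python
from collections import defaultdict

def distribute_blocks(block_sizes, num_nodes):
    """Assign data blocks to NUMA nodes round-robin (a stand-in policy)."""
    return {block: i % num_nodes for i, block in enumerate(sorted(block_sizes))}

def pick_node(task_deps, block_sizes, placement, num_nodes):
    """Return the node that holds the most bytes of the task's dependences."""
    bytes_on_node = defaultdict(int)
    for block in task_deps:
        bytes_on_node[placement[block]] += block_sizes[block]
    # Ties are broken toward the lowest node id so the result is deterministic.
    return max(range(num_nodes), key=lambda n: (bytes_on_node[n], -n))

# Hypothetical data blocks (sizes in bytes) and tasks with their dependences.
block_sizes = {"A": 4096, "B": 1024, "C": 8192, "D": 2048}
placement = distribute_blocks(block_sizes, num_nodes=2)
tasks = {"t1": ["A", "B"], "t2": ["B", "C"], "t3": ["D"]}
schedule = {name: pick_node(deps, block_sizes, placement, 2)
            for name, deps in tasks.items()}
print(schedule)
```

In this sketch each task is queued on the node that already holds most of its data, so the programmer only declares dependences and never names a node explicitly.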


On the impact of dynamic task scheduling in heterogeneous MPSoCs

2011 International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation ◽

10.1109/samos.2011.6045440 ◽

2011 ◽

Cited By ~ 11

Author(s):

Oliver Arnold ◽

Gerhard Fettweis

Keyword(s):

Task Scheduling ◽

Dynamic Task ◽

The Impact ◽

Dynamic Task Scheduling


Thermographic Study of Chip Temperature in High-Speed Dry Milling Magnesium Alloys

Management and Production Engineering Review ◽

10.1515/mper-2016-0020 ◽

2016 ◽

Vol 7 (2) ◽

pp. 86-92 ◽

Cited By ~ 5

Author(s):

Józef Kuczmaszewski ◽

Ireneusz Zagórski ◽

Piotr Zgórniak

Keyword(s):

Temperature Measurement ◽

Magnesium Alloys ◽

High Speed ◽

Machining Process ◽

Dry Milling ◽

Technological Parameters ◽

Chip Temperature ◽

Thermographic Study ◽

On Chip ◽

The Impact

This paper presents an overview of the state of knowledge on temperature measurement in the cutting area during magnesium alloy milling. It also includes results of the authors' own research on chip temperature measurement during dry milling of magnesium alloys. The tested magnesium alloys are frequently used to manufacture components for the aerospace industry. The impact of technological parameters on the maximum chip temperature during milling is also analysed. This study is relevant because of the risk of chip ignition during the machining process.


Characterizing and Mitigating Work Time Inflation in Task Parallel Programs

Scientific Programming ◽

10.1155/2013/898597 ◽

2013 ◽

Vol 21 (3-4) ◽

pp. 123-136 ◽

Cited By ~ 1

Author(s):

Stephen L. Olivier ◽

Bronis R. de Supinski ◽

Martin Schulz ◽

Jan F. Prins

Keyword(s):

Poor Performance ◽

Data Access ◽

Parallel Applications ◽

Task Parallelism ◽

Improve Performance ◽

Additional Time ◽

Work Time ◽

Task Parallel ◽

Time Required ◽

Data Access Latency

Task parallelism raises the level of abstraction in shared memory parallel programming to simplify the development of complex applications. However, task parallel applications can exhibit poor performance due to thread idleness, scheduling overheads, and work time inflation, the additional time spent by threads in a multithreaded computation beyond the time required to perform the same work in a sequential computation. We identify the contributions of each factor to lost efficiency in various task parallel OpenMP applications and diagnose the causes of work time inflation in those applications. Increased data access latency can cause significant work time inflation in NUMA systems. Our locality framework for task parallel OpenMP programs mitigates this cause of work time inflation. Our extensions to the Qthreads library demonstrate that locality-aware scheduling can improve performance by up to 3X compared to the Intel OpenMP task scheduler.
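The definition of work time inflation given in this abstract can be written as a short calculation. This is a sketch assuming that per-thread work times (excluding idle time and scheduling overhead) have already been measured; the function name and the sample timings are hypothetical, not taken from the paper.

```python
def work_time_inflation(parallel_work_times, sequential_time):
    """Compare total parallel work time against the sequential time.

    parallel_work_times: seconds each thread spent doing useful work
        (idle time and scheduling overhead are assumed to be excluded).
    sequential_time: seconds the same work takes in a sequential run.
    Returns (extra_seconds, inflation_ratio).
    """
    total = sum(parallel_work_times)
    return total - sequential_time, total / sequential_time

# Hypothetical measurements: four threads, 10 s sequential baseline.
extra, ratio = work_time_inflation([3.0, 3.5, 2.5, 3.0], 10.0)
print(extra, ratio)
```

A ratio above 1 means the threads collectively spent longer on the work itself than a sequential run would, which on NUMA systems can stem from increased data access latency.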


Supply Chain Integration (SCI) Fashion Products Made By SMEs In Response to Improve Performance Development Of Tourism In West Java

Jurnal Manajemen Pelayanan Publik ◽

10.24198/jmpp.v4i1.25449 ◽

2020 ◽

Vol 4 (1) ◽

pp. 25

Author(s):

Imam Suwandi ◽

Erna Maulina ◽

Tetty Herawati

Keyword(s):

Supply Chain ◽

Supply Chain Integration ◽

Open Door ◽

Improve Performance ◽

West Java ◽

Fashion Products ◽

Performance Development ◽

Travel Industry ◽

The Impact ◽

The City

The growth of tourism in West Java can be an opportunity for MSMEs in the city of Bandung to meet the needs of tourists and to increase their income. In response to this growth, SMEs need appropriate preparation and suitable approaches to meet these challenges. Collaboration in supply chain management is thought to be an appropriate way to improve organizational performance and increase competitive advantage. Supply Chain Integration (SCI) can influence organizational performance. This article considers the effect of Supply Chain Integration strategies on organizational performance and investigates the impact of SCI on organizational performance in Micro, Small and Medium Enterprises (MSMEs) producing fashion products in West Java. The study uses a questionnaire developed with a validated measurement scale from previous studies; empirical data were collected through a survey of 207 MSMEs using probability sampling. This is a quantitative study with analysis using SEM-PLS. It provides recommendations for MSMEs in West Java in particular.


Smart Caching at CMS: applying AI to XCache edge services

EPJ Web of Conferences ◽

10.1051/epjconf/202024504024 ◽

2020 ◽

Vol 245 ◽

pp. 04024

Author(s):

Daniele Spiga ◽

Diego Ciangottini ◽

Mirco Tracolli ◽

Tommaso Tedeschi ◽

Daniele Cesini ◽

...

Keyword(s):

National Level ◽

Data Access ◽

Analysis Data ◽

Technical Solution ◽

Early Results ◽

Lake Model ◽

Geographically Distributed ◽

Computing Centers ◽

Working Set ◽

The Impact

The projected storage and compute needs for the HL-LHC will be up to a factor of 10 above what can be achieved by the evolution of current technology within a flat budget. The WLCG community is studying possible technical solutions to evolve the current computing model in order to cope with the requirements; one of the main focuses is resource optimization, with the ultimate aim of improving performance and efficiency, as well as simplifying operations and reducing their cost. As of today, storage consolidation based on a Data Lake model is considered a good candidate for addressing HL-LHC data access challenges. The Data Lake model under evaluation can be seen as a logical system that hosts a distributed working set of analysis data. Compute power can be "close" to the lake, but also remote and thus completely external. In this context we expect data caching to play a central role as a technical solution to reduce the impact of latency and reduce network load. A geographically distributed caching layer will serve many satellite computing centers that might appear and disappear dynamically. In this talk we propose a system of caches, distributed at national level, describing both the deployment and the results of the studies made to measure the impact on CPU efficiency. In this contribution, we also present early results on a novel caching strategy beyond the standard XRootD approach, whose results will be a baseline for an AI-based smart caching system.


A Semantic Framework to Improve Interoperability of Malaria Surveillance Systems

Online Journal of Public Health Informatics ◽

10.5210/ojphi.v10i1.8987 ◽

2018 ◽

Vol 10 (1) ◽

Cited By ~ 1

Author(s):

Jon Hael Simon Brenas ◽

Mohammad S. Al-Manir ◽

Kate Zinszer ◽

Christopher J. Baker ◽

Arash Shaban-Nejad

Keyword(s):

Semantic Web ◽

Insecticide Resistance ◽

Data Access ◽

Data Sources ◽

Surveillance Systems ◽

Malaria Surveillance ◽

Semantic Framework ◽

Ontology Language ◽

The Impact ◽

Target Data

Objective: Malaria is one of the top causes of death in Africa and some other regions of the world. Data-driven surveillance activities are essential for enabling timely interventions to alleviate the impact of the disease and eventually eliminate malaria. Improving the interoperability of data sources through the use of shared semantics is a key consideration when designing surveillance systems, which must be robust in the face of dynamic changes to one or more components of a distributed infrastructure. Here we introduce a semantic framework to improve interoperability of malaria surveillance systems (SIEMA).

Introduction: In 2015, there were 212 million new cases of malaria and about 429,000 malaria deaths worldwide. African countries accounted for almost 90% of global cases of malaria and 92% of malaria deaths. Currently, malaria data are scattered across different countries, laboratories, and organizations in heterogeneous data formats and repositories. The diversity of access methodologies makes it difficult to retrieve relevant data in a timely manner. Moreover, the lack of rich metadata limits the reusability of data and its integration. The current process of discovering, accessing and reusing the data is inefficient and error-prone, profoundly hindering surveillance efforts. As our knowledge about malaria and appropriate preventive measures becomes more comprehensive, malaria data management systems, data collection standards, and data stewardship are certain to change regularly. Collectively these changes will make it more difficult to perform accurate data analytics or achieve reliable estimates of important metrics, such as infection rates. Consequently, there is a critical need to rapidly re-assess the integrity of data and knowledge infrastructures that experts depend on to support their surveillance tasks.

Methods: In order to address the challenge of heterogeneity of malaria data sources we recruit domain-specific ontologies in the field (e.g. IDOMAL (1)) that define a shared lexicon of concepts and relations. These ontologies are expressed in the standard Web Ontology Language (OWL). To overcome challenges in accessing distributed data resources we have adopted the Semantic Automatic Discovery & Integration framework (SADI) (2) to ensure interoperability. SADI provides a way to describe services that provide access to data, detailing inputs and outputs of services and a functional description. Existing ontology terms are used when building SADI service descriptions. The services can be discovered by querying a registry and combined into complex workflows. Users can issue SPARQL syntax to a query engine which can plan complex workflows to fetch actual data, without having to know how target data is structured or where it is located. In order to tackle changes in target data sources, the ontologies or the service definitions, we create a Dashboard (3) that can report any changes. The Dashboard reuses some existing tools to perform a series of checks. These tools compare versions of ontologies and databases, allowing the Dashboard to report these changes. Once a change has been identified, a series of recommendations can be made, e.g. services can be retired or updated so that data access can continue.

Results: We used the Mosquito Insecticide Resistance Ontology (MIRO) (5) to define the common lexicon for our data sources and queries. The sources we created are CSV files that use the IRbase (4) schema. With the data defined using we specified several SPARQL queries and the SADI services needed to answer them. These services were designed to enable access to the data separated in different files using different formats. In order to showcase the capabilities of our Dashboard, we also modified parts of the service definitions, of the ontology and of the data sources. This allowed us to test our change detection capabilities. Once changes were detected, we manually updated the services to comply with the revised ontology and data sources and checked that the changes we proposed yielded services that gave the right answers. In the future, we plan to make the updating of the services automatic.

Conclusions: Being able to make the relevant information accessible to a surveillance expert in a seamless way is critical in tackling and ultimately curing malaria. In order to achieve this, we used existing ontologies and semantic web services to increase the interoperability of the various sources. Because the data and the ontologies are likely to change frequently, we also designed a tool that allows us to detect and identify the changes and to update the services so that the whole surveillance system becomes more resilient.

References:
1. P. Topalis, E. Mitraka, V. Dritsou, E. Dialynas and C. Louis, "IDOMAL: the malaria ontology revisited", Journal of Biomedical Semantics, vol. 4, no. 1, p. 16, Sep 2013.
2. M. D. Wilkinson, B. Vandervalk and L. McCarthy, "The Semantic Automated Discovery and Integration (SADI) web service design-pattern, API and reference implementation", Journal of Biomedical Semantics, vol. 2, no. 1, p. 8, 2011.
3. J.H. Brenas, M.S. Al-Manir, C.J.O. Baker and A. Shaban-Nejad, "Change management dashboard for the SIEMA global surveillance infrastructure", International Semantic Web Conference, 2017.
4. E. Dialynas, P. Topalis, J. Vontas and C. Louis, "MIRO and IRbase: IT Tools for the Epidemiological Monitoring of Insecticide Resistance in Mosquito Disease Vectors", PLOS Neglected Tropical Diseases, 2009.


The Quest to solve the HL-LHC data access puzzle

EPJ Web of Conferences ◽

10.1051/epjconf/202024504027 ◽

2020 ◽

Vol 245 ◽

pp. 04027

Author(s):

X. Espinal ◽

S. Jezequel ◽

M. Schulz ◽

A. Sciabà ◽

I. Vukotic ◽

...

Keyword(s):

Data Storage ◽

Working Group ◽

Data Access ◽

Current Data ◽

Resource Needs ◽

Lake Model ◽

New Concepts ◽

Depth Analysis ◽

Group Members ◽

The Impact

HL-LHC will confront the WLCG community with enormous data storage, management and access challenges. These are as much technical as economic. In the WLCG-DOMA Access working group, members of the experiments and site managers have explored different models for data access and storage strategies to reduce cost and complexity, taking into account the boundary conditions given by our community. Several of these scenarios have been evaluated quantitatively, such as the Data Lake model and incremental improvements of the current computing model with respect to resource needs, costs and operational complexity. To better understand these models in depth, analysis of traces of current data accesses and simulations of the impact of new concepts have been carried out. In parallel, evaluations of the required technologies took place. These were done in testbed and production environments at small and large scale. We will give an overview of the activities and results of the working group, describe the models and summarise the results of the technology evaluation, focusing on the impact of storage consolidation in the form of Data Lakes, where the use of streaming caches has emerged as a successful approach to reduce the impact of latency and bandwidth limitation. We will describe the experience and evaluation of these approaches in different environments and usage scenarios. In addition we will present the results of the analysis and modelling efforts based on data access traces of the experiments.


A Data Distribution Aware Task Scheduling Strategy for MapReduce System

Lecture Notes in Computer Science - Cloud Computing ◽

10.1007/978-3-642-10665-1_74 ◽

2009 ◽

pp. 694-699 ◽

Cited By ~ 4

Author(s):

Leitao Guo ◽

Hongwei Sun ◽

Zhiguo Luo

Keyword(s):

Task Scheduling ◽

Data Distribution ◽

Scheduling Strategy


Study of the Impact of Protective Process Agents with Carbon Nanopowder Additives on Chip Forming Processes

Key Engineering Materials ◽

10.4028/www.scientific.net/kem.887.319 ◽

2021 ◽

Vol 887 ◽

pp. 319-324

Author(s):

E.A. Petrovsky ◽

K.A. Bashmur ◽

Vadim S. Tynchenko

Keyword(s):

Experimental Studies ◽

Cutting Speed ◽

Optimal Composition ◽

Cooling Process ◽

Shrinkage Ratio ◽

Nickel Chromium ◽

Chromium Alloys ◽

On Chip ◽

The Impact ◽

Positive Effect

The present study describes the impact of various protective process agents on chip forming processes. The research was conducted on NiCr20TiAl and 34NiCrMoV14-5 nickel-chromium alloys. New lubricant-cooling process agents with carbon nanopowder additives are studied. The optimal composition of the nanopowder additive and its effect during alloy cutting are examined. Experiments reveal the dependence of the shrinkage ratio on cutting speed and on the protective process agent used. The values of H50 microhardness are also determined when cutting these alloys using protective process agents. Experimental studies found a positive effect of the developed agents with nanopowder additives on chip formation in NiCr20TiAl and 34NiCrMoV14-5 alloys.


Designing of High Performance Multicore Processor with Improved Cache Configuration and Interconnect

Advances in Systems Analysis, Software Engineering, and High Performance Computing - Emerging Research Surrounding Power Consumption and Performance Issues in Utility Computing ◽

10.4018/978-1-4666-8853-7.ch009 ◽

2016 ◽

pp. 204-219

Author(s):

Ram Prasad Mohanty ◽

Ashok Kumar Turuk ◽

Bibhudatta Sahoo

Keyword(s):

High Performance ◽

Multicore Processors ◽

Multicore Processor ◽

Cache Size ◽

L2 Cache ◽

Internal Network ◽

On Chip ◽

L1 And L2 ◽

The Impact ◽

Cache Configuration

The growing number of cores increases the demand for a powerful memory subsystem, which leads to larger caches in multicore processors. Caches are responsible for giving processing elements a faster, higher-bandwidth local memory to work with. In this chapter, an attempt has been made to analyze the impact of cache size on the performance of multicore processors by varying the L1 and L2 cache sizes on the multicore processor with internal network (MPIN) referenced from the Niagara architecture. As the number of cores increases, traditional on-chip interconnects such as bus and crossbar prove to be inefficient and suffer from poor scalability. In order to overcome the scalability and efficiency issues in these conventional interconnects, a ring-based design has been proposed. The effect of the interconnect on the performance of multicore processors has been analyzed, and a novel scalable on-chip interconnection mechanism (INoC) for multicore processors has been proposed. The benchmark results are obtained using a full system simulator. Results show that the proposed INoC significantly reduces execution time compared with the MPIN.
