Enabling Fair Pricing on High Performance Computer Systems with Node Sharing

2014 ◽  
Vol 22 (2) ◽  
pp. 59-74 ◽  
Author(s):  
Alex D. Breslow ◽  
Ananta Tiwari ◽  
Martin Schulz ◽  
Laura Carrington ◽  
Lingjia Tang ◽  
...  

Co-location, where multiple jobs share compute nodes in large-scale HPC systems, has been shown to increase aggregate throughput and energy efficiency by 10–20%. However, system operators disallow co-location due to fair-pricing concerns, i.e., the lack of a pricing mechanism that accounts for performance interference from co-running jobs. In the current pricing model, application execution time determines the price, which results in unfair prices paid by the minority of users whose jobs suffer from co-location. This paper presents POPPA, a runtime system that enables fair pricing by delivering precise online interference detection, facilitating the adoption of co-location on supercomputers. POPPA leverages a novel shutter mechanism, a cyclic, fine-grained interference sampling technique that accurately deduces the interference between co-runners, to provide unbiased pricing of jobs that share nodes. POPPA quantifies inter-application interference to within 4% mean absolute error on a variety of co-located benchmark and real scientific workloads.
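To make the shutter idea concrete, the sketch below shows one way such cyclic sampling could work: co-runners are briefly paused so the target job's solo progress rate can be compared with its co-run rate, and the resulting slowdown estimate discounts the price. All job-control calls here are hypothetical placeholders, not POPPA's actual interface.

```python
# Hypothetical sketch of shutter-style interference sampling (illustrative
# names, not POPPA's API): briefly pause co-runners, compare the target's
# solo progress rate with its co-run rate, and price accordingly.
import time

def measure_rate(job, window_s):
    """Progress per second over a short measurement window."""
    start = job.progress_counter()          # hypothetical progress probe
    time.sleep(window_s)
    return (job.progress_counter() - start) / window_s

def shutter_sample(target, co_runners, window_s=0.2):
    corun_rate = measure_rate(target, window_s)   # shutter closed: all running
    for j in co_runners:
        j.pause()                                 # shutter open: co-runners paused
    solo_rate = measure_rate(target, window_s)
    for j in co_runners:
        j.resume()
    return max(0.0, 1.0 - corun_rate / solo_rate)  # fractional slowdown

def fair_price(rate_per_hour, hours, slowdown):
    # Charge only for the time the job would have needed without interference.
    return rate_per_hour * hours * (1.0 - slowdown)
```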

Author(s):  
Yifan Gao ◽  
Yang Zhong ◽  
Daniel Preoţiuc-Pietro ◽  
Junyi Jessy Li

In computational linguistics, specificity quantifies the level of detail engaged in a text. It is an important characteristic of speaker intention and language style, and is useful in NLP applications such as summarization and argumentation mining. Yet to date, expert-annotated data for sentence-level specificity are scarce and confined to the news genre. In addition, systems that predict sentence specificity are classifiers trained to produce binary labels (general or specific). We collect a dataset of over 7,000 tweets annotated with specificity on a fine-grained scale. Using this dataset, we train a supervised regression model that accurately estimates specificity in social media posts, reaching a mean absolute error of 0.3578 (on a 1-5 rating scale) and a 0.73 Pearson correlation, significantly improving over baselines and previous sentence specificity prediction systems. We also present the first large-scale study revealing the social, temporal and mental health factors underlying language specificity on social media.
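As a rough illustration of the evaluation setup, the sketch below trains a toy regressor and computes the two reported metrics, mean absolute error and Pearson correlation; the features, model, and ratings are placeholders, not the authors' system.

```python
# Toy evaluation of a specificity regressor with the two reported metrics;
# features, model, and ratings are invented, not the authors' data.
import numpy as np
from scipy.stats import pearsonr
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge

tweets = [
    "Prices went up again.",
    "The 8GB model now costs $329 at the downtown store.",
    "Feeling great today!",
    "Flight UA1549 lands at 6:42pm at gate B12.",
]
gold = np.array([2.0, 4.5, 1.5, 5.0])   # invented specificity ratings, 1-5 scale

X = TfidfVectorizer().fit_transform(tweets)
preds = Ridge().fit(X, gold).predict(X)

print("MAE:", np.mean(np.abs(preds - gold)))
print("Pearson r:", pearsonr(preds, gold)[0])
```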


2019 ◽  
Vol 7 (1) ◽  
pp. 55-70
Author(s):  
Moh. Zikky ◽  
M. Jainal Arifin ◽  
Kholid Fathoni ◽  
Agus Zainal Arifin

High-Performance Computing (HPC) systems are computer systems built to handle heavy computational loads: they provide high-performance technology and shorten computing times. This technology is often used in large-scale industries and in activities that require high-level computing, such as rendering virtual reality. In this research, we provide a virtual reality simulation of the Tawaf, with 1,000 pilgrims and realistic surroundings of Masjidil-Haram imitated with 3D models, as an interactive and immersive simulation. The main purpose of this study is to measure and understand the processing time of this virtual reality implementation of the Tawaf on various platforms, such as desktop computers and Android smartphones. The results showed that agents on the outer rotation path around the Kaaba generally took the least time, even though they must travel a longer distance than agents on the inner path, because agents close to the Kaaba face dense crowds. In this case, obstacles have a greater impact on travel time than distance.
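The distance-versus-congestion trade-off can be illustrated with a back-of-the-envelope model in which travel time is distance divided by walking speed, and crowd density suppresses speed; the numbers below are invented for illustration, not taken from the study.

```python
# Toy model (invented numbers): lap time around the Kaaba as circumference
# divided by walking speed, where crowd density suppresses speed.
import math

def lap_time(radius_m, speed_mps):
    return 2 * math.pi * radius_m / speed_mps

inner = lap_time(radius_m=30, speed_mps=0.3)   # short lap through a dense crowd
outer = lap_time(radius_m=90, speed_mps=1.2)   # 3x the distance, free flow
print(f"inner: {inner:.0f} s, outer: {outer:.0f} s")  # ~628 s vs ~471 s
```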


2013 ◽  
Vol 62 (2) ◽  
pp. 219-234 ◽  
Author(s):  
Gábor Szatmári ◽  
Annamária Laborczi ◽  
Gábor Illés ◽  
László Pásztor

In this paper we aimed to digitally map the topsoil organic matter content of Zala County by regression kriging, based on the data of the Digital Kreybig Soil Information System (DKTIR) available to us and on environmental auxiliary variables. In the spatial inference we used the soil-forming factors and the DKTIR soil map units in various combinations. Our goal was to examine how the combinations of auxiliary variables included in the regression kriging model affect the multiple linear regression model underlying the estimation procedure and the accuracy of the estimated map.

The auxiliary variables required for the spatial extension of organic matter content were selected on the basis of the literature. As auxiliary data we used the digital elevation model of Zala County, vegetation index layers derived from MODIS satellite images taken between 2009 and 2011, two climate parameter layers, and the soil map units of the DKTIR. The humus content maps estimated by regression kriging were evaluated against control data points selected in advance from the DKTIR soil profile data, independently of the estimation procedure. For the validation we derived the ME (Mean Error), MAE (Mean Absolute Error), RMSE (Root Mean Square Error) and RI_i(%) (Relative Improvement) parameters, the last of which expresses the relative increase in the accuracy of each map compared to a map chosen as reference.

Based on the results, using spatial soil information as auxiliary data significantly increased the coefficients of determination of the regression models and the accuracy of the estimated humus maps. With two exceptions, the R² values of the regression models that also incorporated pedological auxiliary information well exceeded 30%, i.e. they were able to explain more than one third of the spatial variability of the organic matter content. According to the validation indicators, the maps that used the texture and water-management properties of the DKTIR soils (DKTIR-F) as pedological auxiliary variables proved more accurate. The lowest MAE value (0.747) belonged to the humus map that used the topographic and climatic soil-forming factors together with the DKTIR-F soil map unit as auxiliary variables; for this map the RI_i(%) parameter was 21%. According to the indicators, this map gave the most accurate estimate of the organic matter content of the study area, since through the auxiliary variables it accounts for the erosion and accumulation that fundamentally influence the organic matter content of the study area, as well as for the soil texture, which affects the water balance, infiltration and leaching, and through these the process of humus formation. For the MODIS vegetation index layers representing the biological soil-forming factor, we observed that using them as auxiliary data yielded less accurate estimates than the estimates that omitted them.

Our work was supported by the OTKA grant K105167 and the TÁMOP-4.2.2.A-11/1/KONV-2012-0013 project.
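For reference, the validation statistics named above have their usual definitions; writing $z(s_i)$ for the observed and $\hat{z}(s_i)$ for the predicted humus content at the $n$ control points, and assuming the common form of the relative improvement index (the paper's exact convention may differ):

```latex
\begin{aligned}
\mathrm{ME}   &= \frac{1}{n}\sum_{i=1}^{n}\bigl(\hat{z}(s_i)-z(s_i)\bigr), &
\mathrm{MAE}  &= \frac{1}{n}\sum_{i=1}^{n}\bigl|\hat{z}(s_i)-z(s_i)\bigr|, \\
\mathrm{RMSE} &= \sqrt{\frac{1}{n}\sum_{i=1}^{n}\bigl(\hat{z}(s_i)-z(s_i)\bigr)^{2}}, &
\mathrm{RI}_i(\%) &= \frac{\mathrm{RMSE}_{\mathrm{ref}}-\mathrm{RMSE}_i}{\mathrm{RMSE}_{\mathrm{ref}}}\times 100.
\end{aligned}
```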


BMC Genomics ◽  
2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Yang Yang ◽  
Xuesong Ding ◽  
Guanchen Zhu ◽  
Abhishek Niroula ◽  
Qiang Lv ◽  
...  

Abstract Background Stability is one of the most fundamental intrinsic characteristics of proteins and can be determined with various methods. Characterization of protein properties does not keep pace with the increase in new sequence data, and therefore even basic properties are not known for the vast majority of identified proteins. There have been some attempts to develop predictors for protein stability; however, they have suffered from the small number of known examples. Results We took advantage of results from a recently developed cellular stability method, based on limited proteolysis and mass spectrometry, and developed a machine learning method using gradient boosting of regression trees. The ProTstab method has high performance and is well suited for large-scale prediction of protein stabilities. Conclusions The Pearson correlation coefficient was 0.793 in 10-fold cross-validation and 0.763 in an independent blind test. The corresponding values for mean absolute error are 0.024 and 0.036, respectively. Comparison with a previously published method indicated that ProTstab has superior performance. We used the method to predict the stabilities of all remaining proteins in the entire human proteome and then correlated the predicted stabilities with the protein chain lengths of isoforms and with the localizations of the proteins.
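As a sketch of the modeling approach, gradient boosting of regression trees with 10-fold cross-validated predictions might look as follows in scikit-learn; the features and stability values are random placeholders, not ProTstab's sequence features, data, or implementation.

```python
# Sketch of gradient-boosted regression trees with 10-fold CV predictions,
# using scikit-learn as a stand-in; features and targets are random
# placeholders, not ProTstab's inputs or measured stabilities.
import numpy as np
from scipy.stats import pearsonr
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 40))                       # stand-in feature matrix
y = 0.5 * X[:, 0] + rng.normal(scale=0.3, size=500)  # stand-in stability values

model = GradientBoostingRegressor(n_estimators=300, learning_rate=0.05)
preds = cross_val_predict(model, X, y, cv=10)        # 10-fold cross-validation

print("Pearson r:", pearsonr(preds, y)[0])
print("MAE:", np.mean(np.abs(preds - y)))
```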


2020 ◽  
Vol 245 ◽  
pp. 05042
Author(s):  
Miha Muškinja ◽  
Paolo Calafiura ◽  
Charles Leggett ◽  
Illya Shapoval ◽  
Vakho Tsulaia

The ATLAS experiment has successfully integrated High-Performance Computing resources (HPCs) in its production system. Unlike the current generation of HPC systems, and the LHC computing grid, the next generation of supercomputers is expected to be extremely heterogeneous in nature: different systems will have radically different architectures, and most of them will provide partitions optimized for different kinds of workloads. In this work we explore the applicability of concepts and tools realized in Ray (a high-performance distributed execution framework targeting large-scale machine learning applications) to ATLAS event throughput optimization on heterogeneous distributed resources, ranging from traditional grid clusters to Exascale computers. We present a prototype of Raythena, a Ray-based implementation of the ATLAS Event Service (AES), a fine-grained event processing workflow aimed at improving the efficiency of ATLAS workflows on opportunistic resources, specifically HPCs. The AES is implemented as an event processing task farm that distributes packets of events to several worker processes running on multiple nodes. Each worker in the task farm runs an event-processing application (Athena) as a daemon. The whole system is orchestrated by Ray, which assigns work in a distributed, possibly heterogeneous, environment. For all its flexibility, the AES implementation currently comprises multiple separate layers that communicate through ad hoc command-line and file-based interfaces. The goal of Raythena is to integrate these layers through a feature-rich, efficient application framework. Besides increasing usability and robustness, a vertically integrated scheduler will enable us to explore advanced concepts such as dynamic shaping of workflows to exploit currently available resources, particularly on heterogeneous systems.
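A minimal sketch of an AES-style task farm on Ray is shown below; the actor name and round-robin assignment are illustrative assumptions, not Raythena's actual design, which wraps long-lived Athena daemons behind a richer scheduler.

```python
# Illustrative AES-style task farm on Ray (not Raythena's actual actors):
# a driver distributes packets of event ranges to worker actors.
import ray

ray.init()

@ray.remote
class EventWorker:
    def __init__(self, worker_id):
        # In Raythena each worker wraps a long-lived Athena daemon instead.
        self.worker_id = worker_id

    def process(self, event_range):
        # Stand-in for forwarding the range to the event-processing payload.
        return [f"worker{self.worker_id}:event{e}" for e in event_range]

workers = [EventWorker.remote(i) for i in range(4)]
packets = [range(i * 10, (i + 1) * 10) for i in range(8)]

# Round-robin assignment; a real scheduler would adapt to heterogeneous nodes.
futures = [workers[i % len(workers)].process.remote(p)
           for i, p in enumerate(packets)]
print(sum(len(r) for r in ray.get(futures)), "events processed")
```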


2021 ◽  
Vol 7 ◽  
pp. e552
Author(s):  
Shubai Chen ◽  
Song Wu ◽  
Li Wang

Due to the high efficiency of hashing technology and the high abstraction capability of deep networks, deep hashing has achieved appealing effectiveness and efficiency for large-scale cross-modal retrieval. However, how to efficiently measure the similarity of fine-grained multi-labels for multi-modal data, and how to thoroughly exploit the layer-specific information of the network's intermediate layers, remain two challenges for high-performance cross-modal hashing retrieval. Thus, in this paper, we propose a novel Hierarchical Semantic Interaction-based Deep Hashing Network (HSIDHN) for large-scale cross-modal retrieval. In the proposed HSIDHN, multi-scale and fusion operations are first applied to each layer of the network. A Bidirectional Bi-linear Interaction (BBI) policy is then designed to achieve hierarchical semantic interaction among the different layers, such that the capability of the hash representations is enhanced. Moreover, a dual-similarity measurement (“hard” similarity and “soft” similarity) is designed to calculate the semantic similarity of data from different modalities, aiming to better preserve the semantic correlation of multi-labels. Extensive experimental results on two large-scale public datasets show that the performance of HSIDHN is competitive with state-of-the-art deep cross-modal hashing methods.
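One plausible reading of the dual-similarity measurement is sketched below, with “hard” similarity as a binary indicator of any shared label and “soft” similarity as the cosine between multi-label vectors; the paper's exact definitions may differ.

```python
# One plausible reading of the "hard"/"soft" dual similarity (illustrative;
# the paper's definitions may differ): shared-label indicator vs. cosine
# similarity between binary multi-label vectors.
import numpy as np

def hard_similarity(a, b):
    """1.0 if the two samples share at least one label, else 0.0."""
    return float(np.any(np.logical_and(a, b)))

def soft_similarity(a, b):
    """Cosine similarity between multi-label vectors, in [0, 1]."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b) / denom if denom else 0.0

image_labels = np.array([1, 0, 1, 1, 0])  # labels of an image sample
text_labels  = np.array([1, 0, 0, 1, 1])  # labels of a text sample
print(hard_similarity(image_labels, text_labels))  # 1.0
print(soft_similarity(image_labels, text_labels))  # ~0.67
```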


2021 ◽  
Author(s):  
Depeng Zuo ◽  
Guangyuan Kan ◽  
Hongquan Sun ◽  
Hongbin Zhang ◽  
Ke Liang

Abstract. The Generalized Likelihood Uncertainty Estimation (GLUE) method has thrived for decades, and a huge number of applications in the field of hydrological modeling have proved its effectiveness in uncertainty and parameter estimation. However, for many years the poor computational efficiency of GLUE has hampered its further application. A feasible way to solve this problem is to integrate modern CPU-GPU hybrid high-performance computer cluster technology to accelerate the traditional GLUE method. In this study, we developed a highly parallel, large-scale GLUE method based on a CPU-GPU hybrid computer cluster to improve its computational efficiency. Intel Xeon multi-core CPUs and NVIDIA Tesla many-core GPUs were adopted. The source code was developed using MPICH2, C++ with OpenMP 2.0, and CUDA 6.5. The parallel GLUE method was tested with a widely used hydrological model (the Xinanjiang model) to investigate its performance and scalability. Comparison results indicated that the parallel GLUE method outperformed the traditional serial method and has good application prospects on supercomputer clusters such as Summit and Sierra of the TOP500 list.
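The sketch below illustrates the embarrassingly parallel structure that makes GLUE a good fit for such acceleration, using Python's multiprocessing as a stand-in for the paper's MPI/OpenMP/CUDA stack; the runoff model is a trivial placeholder, not the Xinanjiang model.

```python
# Embarrassingly parallel GLUE sketch: sample parameters, run the model once
# per sample, score each run with a likelihood, keep "behavioral" sets.
import numpy as np
from multiprocessing import Pool

rain = np.random.default_rng(1).gamma(2.0, 5.0, size=200)           # forcing
observed = 0.6 * rain + np.random.default_rng(2).normal(0, 2, 200)  # toy obs

def likelihood(k):
    simulated = k * rain                        # placeholder runoff model
    err = observed - simulated
    nse = 1 - np.sum(err**2) / np.sum((observed - observed.mean())**2)
    return k, nse                               # Nash-Sutcliffe as likelihood

if __name__ == "__main__":
    samples = np.random.default_rng(3).uniform(0, 1, 10000)  # Monte Carlo draws
    with Pool() as pool:                        # one model run per sample
        results = pool.map(likelihood, samples)
    behavioral = [(k, l) for k, l in results if l > 0.5]      # GLUE threshold
    print(len(behavioral), "behavioral parameter sets retained")
```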

