A GPU Scheduling Framework to Accelerate Hyper-Parameter Optimization in Deep Learning Clusters

Electronics ◽  
2021 ◽  
Vol 10 (3) ◽  
pp. 350
Author(s):  
Jaewon Son ◽  
Yonghyuk Yoo ◽  
Khu-rai Kim ◽  
Youngjae Kim ◽  
Kwonyong Lee ◽  
...  

This paper proposes Hermes, a container-based preemptive GPU scheduling framework for accelerating hyper-parameter optimization in deep learning (DL) clusters. Hermes accelerates hyper-parameter optimization by time-sharing the GPU between DL jobs and prioritizing jobs with more promising hyper-parameter combinations. Hermes's scheduling policy is grounded in the observation that good hyper-parameter combinations converge quickly in the early phases of training. By giving higher priority to fast-converging containers, Hermes's GPU preemption mechanism can accelerate training. This enables users to find optimal hyper-parameters faster without losing the progress of a container. We have implemented Hermes on Kubernetes and compared its performance against existing scheduling frameworks. Experiments show that Hermes reduces the time for hyper-parameter optimization by up to 4.04 times compared with previously proposed scheduling policies such as FIFO, round-robin (RR), and SLAQ, with minimal time-sharing overhead.
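The core idea of the policy described above can be sketched as a priority function over running jobs: score each job by how quickly its loss has been falling in the early phase, and hand the GPU to the best scorer. This is a minimal illustrative sketch, not Hermes's actual implementation; the `Job` class, `convergence_rate`, and `pick_next_job` names are hypothetical.

```python
class Job:
    """A training job trying one hyper-parameter combination."""

    def __init__(self, name, params):
        self.name = name
        self.params = params        # hyper-parameter combination under trial
        self.loss_history = []      # training losses observed so far

    def convergence_rate(self):
        # Average loss decrease per step over the observed early phase;
        # a larger value suggests a more promising combination.
        if len(self.loss_history) < 2:
            return 0.0
        drop = self.loss_history[0] - self.loss_history[-1]
        return drop / (len(self.loss_history) - 1)


def pick_next_job(jobs):
    # Hermes-style priority: preempt in favor of the job whose loss
    # is converging fastest so far.
    return max(jobs, key=lambda j: j.convergence_rate())
```

A real scheduler would recompute these scores periodically and checkpoint the preempted container so no progress is lost, as the paper describes.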

Author(s):  
ASHIS KUMAR MISHRA ◽  
ZULFIKHAR AHMAD ◽  
YOGAMAYA MOHAPATRA ◽  
ANIL KUMAR MISHRA

Scheduling a sequence of jobs released over time, when the processing time of a job is known only at its completion, is a classical problem in CPU scheduling for time-sharing and real-time operating systems. We discuss the different scheduling techniques used in real-time systems. Although several scheduling policies exist, preemptive scheduling policies show the most promising results. In this paper we present an extensive survey of various scheduling algorithms, extracting the positive characteristics of each and collecting them here.
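To make the preemptive family of policies surveyed above concrete, here is a minimal sketch of preemptive round-robin, the simplest preemptive time-sharing policy: each job runs for at most one time quantum, then is preempted and re-queued if work remains. The function name and the job-map representation are illustrative assumptions.

```python
from collections import deque


def round_robin(jobs, quantum):
    """Simulate preemptive round-robin scheduling.

    `jobs` maps job name -> remaining processing time; each job runs
    for at most `quantum` time units before being preempted and sent
    to the back of the ready queue. Returns (name, completion_time)
    pairs in finish order.
    """
    queue = deque(jobs.items())
    finish_order = []
    clock = 0
    while queue:
        name, remaining = queue.popleft()
        run = min(quantum, remaining)   # run until done or preempted
        clock += run
        remaining -= run
        if remaining > 0:
            queue.append((name, remaining))  # preempted: back of the queue
        else:
            finish_order.append((name, clock))
    return finish_order
```

Note that round-robin needs no knowledge of processing times in advance, which is exactly the setting the survey considers.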


1995 ◽  
Vol 22 (10-12) ◽  
pp. 247-259 ◽  
Author(s):  
M. Ohnishi ◽  
H. Maeda ◽  
T. Ibaraki

2010 ◽  
Vol 20 (5-6) ◽  
pp. 417-461 ◽  
Author(s):  
DANIEL SPOONHOWER ◽  
GUY E. BLELLOCH ◽  
ROBERT HARPER ◽  
PHILLIP B. GIBBONS

We present a semantic space profiler for parallel functional programs. Building on previous work in sequential profiling, our tools help programmers to relate runtime resource use back to program source code. Unlike many profiling tools, our profiler is based on a cost semantics. This provides a means to reason about performance without requiring a detailed understanding of the compiler or runtime system. It also provides a specification for language implementers. This is critical in that it enables us to separate cleanly the performance of the application from that of the language implementation. Some aspects of the implementation can have significant effects on performance. Our cost semantics enables programmers to understand the impact of different scheduling policies while hiding many of the details of their implementations. We show applications where the choice of scheduling policy has asymptotic effects on space use. We explain these use patterns through a demonstration of our tools. We also validate our methodology by observing similar performance in our implementation of a parallel extension of Standard ML.
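The claim above that scheduling policy can have asymptotic effects on space use can be illustrated with a toy simulation (an assumption-laden sketch, not the paper's cost semantics): n independent parallel tasks each allocate one unit of memory while running. A depth-first schedule finishes each task before starting the next, so at most one allocation is live at a time; a breadth-first schedule starts all tasks before any finishes, so all n allocations are live at once.

```python
def peak_space(policy, n):
    """Peak live memory for n parallel tasks, one allocation unit each.

    Each task's lifetime is an alloc (+1) followed by a free (-1).
    Depth-first interleaves them task by task; breadth-first performs
    all allocations before any free, as when all tasks run together.
    """
    if policy == "depth-first":
        events = [+1, -1] * n            # one task fully done at a time
    elif policy == "breadth-first":
        events = [+1] * n + [-1] * n     # all tasks live simultaneously
    else:
        raise ValueError(policy)
    live = peak = 0
    for e in events:
        live += e
        peak = max(peak, live)
    return peak
```

Here depth-first uses O(1) space and breadth-first uses O(n): the same program, with the same total work, differs asymptotically in space purely by scheduling choice.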


A framework to perform video analysis is proposed using a dynamically tuned convolutional network. Videos are obtained from cloud storage, preprocessed, and a model supporting classification is developed on these video streams using cloud-based infrastructure. A key focus in this paper is on tuning the hyper-parameters associated with the deep learning algorithm used to build the model. We further propose an automatic video object classification pipeline to validate the framework. The mathematical model used to support hyper-parameter tuning improves the performance of the proposed pipeline, and the effect of different parameters on the system's performance is analyzed. In this way, the parameters that contribute toward the best performance are chosen for the video object classification pipeline. Our experiment-based validation reveals an accuracy and precision of 97% and 96%, respectively. The framework proved to be scalable, robust, and customizable for a wide range of applications.
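The hyper-parameter selection step described above, trying combinations and keeping the one that performs best, can be sketched as an exhaustive grid search. This is a hedged illustration only: the paper does not specify its tuning algorithm, and `evaluate` stands in for a hypothetical function returning the pipeline's validation accuracy for a given parameter combination.

```python
from itertools import product


def grid_search(evaluate, grid):
    """Try every combination in `grid` (name -> list of values) and
    return the best-scoring parameters with their score.

    `evaluate` is assumed to map a dict of hyper-parameters to a
    validation score (higher is better)."""
    names = list(grid)
    best_score, best_params = float("-inf"), None
    for values in product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:
            best_score, best_params = score, params
    return best_params, best_score
```

For large search spaces, random search or the convergence-based prioritization of the Hermes paper above would replace this exhaustive loop.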

