Adding Preemptive Scheduling Capabilities to Accelerators

Author: Shaleen Garg
Date: 2020-01-06
Report no: IIIT/TH/2020/1
Advisor: Kishore Kothapalli

Abstract

Time-sharing, which allows multiple users to use a shared resource, is a fundamental aspect of modern computing systems. However, accelerators such as GPUs, which come without a native operating system, do not support time-sharing. This limits their applicability, especially as they are deployed in Platform-as-a-Service (PaaS) and Resource-as-a-Service (RaaS) environments. In PaaS, elastic demands may require preemption, whereas in RaaS, time-sharing can support fine-grained economic models of service cost. We extend the concept of time-sharing to the GPGPU computational space using a cooperative multitasking approach. Our technique is applicable to any GPGPU program written with the Compute Unified Device Architecture (CUDA) API for the C/C++ programming languages. With minimal support from the programmer, our framework incorporates process scheduling, lightweight memory management, and multi-GPU support. It provides an abstraction in which every workload can use one or more GPUs exclusively for a time quantum. We demonstrate the applicability of our scheduling framework by running many workloads concurrently in a time-sharing manner. In this context, we explore two scheduling algorithms: Round-Robin (RR) and Shortest Job First (SJF). We demonstrate the versatility of the framework by running workloads from different domains.
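
As a rough illustration of the two scheduling policies named in the abstract, the following host-side Python sketch simulates cooperative time-sharing of a single device. It is not the thesis framework: the `Workload` class, the `run_quantum` helper, and the idea of measuring work in abstract "units" are all hypothetical, and SJF is assumed here to be non-preemptive, which the abstract does not specify.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    remaining: int  # hypothetical: abstract work units left to run


def run_quantum(job, quantum):
    """Run `job` for at most `quantum` units, then hand the device back."""
    used = min(quantum, job.remaining)
    job.remaining -= used
    return used


def round_robin(jobs, quantum):
    """Return (name, units) slices in the order the device was granted."""
    queue = deque(jobs)
    timeline = []
    while queue:
        job = queue.popleft()
        timeline.append((job.name, run_quantum(job, quantum)))
        if job.remaining > 0:
            queue.append(job)  # unfinished work goes to the back of the queue
    return timeline


def shortest_job_first(jobs):
    """Assumed non-preemptive SJF: shortest job runs to completion first."""
    timeline = []
    for job in sorted(jobs, key=lambda j: j.remaining):
        timeline.append((job.name, run_quantum(job, job.remaining)))
    return timeline
```

Under Round-Robin, each workload holds the device exclusively for one quantum and then yields voluntarily, which is the cooperative aspect; under SJF as assumed here, the ordering depends on an estimate of each job's length being available up front.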

Full thesis: pdf

Centre for Security, Theory and Algorithms

IIIT Hyderabad Publications
