Evaluating orthogonality between application auto-tuning and run-time resource management for adaptive OpenCL applications

Abstract

The ever-increasing number of processing units integrated on the same many-core chip delivers computational power that can exceed the performance requirements of a single application. The number of chips (and the related power consumption) can thus be reduced to serve multiple applications, a practice called resource consolidation. However, this solution requires techniques to partition and assign resources among the applications and to manage unpredictable dynamic workloads. To meet the performance requirements in such scenarios, we exploit application auto-tuning, based on design-time analysis, of both application-specific dynamic knobs and computational parallelism. These features are implemented in a software library, which is used to demonstrate the main contribution of this paper: a light-weight Run-Time Resource Management (RTRM) technique to improve resource sharing for computationally intensive OpenCL applications. We evaluate the extent to which the interaction between RTRM and application auto-tuning can be synergistic yet orthogonal. In the proposed approach, run-time adaptation decisions are taken autonomously by each application. This has two main advantages: i) a non-invasive application design, in terms of source code, and ii) a very low run-time overhead, since it requires neither central coordination by a supervisor nor communication between the applications. We carried out an experimental campaign using a video processing application (an OpenCL stereo-matching implementation) while stressing resource usage. We showed that, while RTRM is necessary to lower the variance of the application performance, the application auto-tuning layer is fundamental to trade it off against computation accuracy.

Publication
Proceedings of the International Conference on Application-Specific Systems, Architectures and Processors
Davide Gadioli
Assistant Professor

He earned his M.S. in Information Technology (2013) and Ph.D. cum laude in 2019 from Politecnico di Milano. A former Visiting Student at IBM Research (2015), he is now a postdoctoral researcher at DEIB, focusing on application autotuning, approximate computing, molecular docking, and drug discovery. He contributes to EXSCALATE software development.

Gianluca Palermo
Full Professor

Gianluca Palermo received the M.Sc. degree in Electronic Engineering in 2002 and the Ph.D. degree in Computer Engineering in 2006 from Politecnico di Milano. He is currently an associate professor at the Department of Electronics and Information Technology at the same university. Previously, he was also a consultant engineer in the Low Power Design Group of AST – STMicroelectronics, working on networks-on-chip, and a research assistant at the Advanced Learning and Research Institute (ALaRI) of the Università della Svizzera italiana (Switzerland). His research interests include design methodologies and architectures for embedded and HPC systems, focusing on autotuning aspects.

Vittorio Zaccaria
Associate Professor

I am an associate professor at Politecnico di Milano, and I have worked in embedded processor architecture R&D for one of the top semiconductor companies in the world. My group is currently working on topics related to embedded systems (hardware and software), security, cryptography, and operating systems.