SMaLL: a Framework for Rapidly Generating ML Libraries

Upasana Sridhar, PhD student, Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh

21 November 2023

Talk summary: There is much interest in deploying deep neural networks (DNNs) on edge devices such as microcontrollers, Raspberry Pis, and smartphones. However, embedded devices are often resource constrained, making high-performance machine learning (ML) libraries critical to enable DNN deployment. These libraries are typically hand-tuned for very specific hardware features and are difficult to port, even across generations of the same hardware architecture. The rapid development of new edge devices, combined with the high implementation effort that such libraries demand, leaves many new devices with sparse library support.

The SMaLL framework is an open-source solution to rapidly port high-performance machine learning libraries to new CPU (central processing unit) architectures. The key insight is that the operations used in DNNs can be expressed using a shared abstraction. This allows performance-specific optimisations to be isolated to a small section of code, called a kernel, so that supporting a new architecture requires only a few hundred lines of new code. Further, the resulting libraries frequently outperform the state-of-the-art ML framework on each target platform.
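
The idea of a shared abstraction with an isolated kernel can be illustrated with a minimal sketch. This is not SMaLL's actual API or implementation; the names windowed_op, conv2d, and max_pool are hypothetical, and plain Python lists stand in for tensors. The sketch assumes that a convolution and a max-pooling layer can share one loop nest and differ only in the small function applied to each input element, which is the part one would rewrite for a new architecture.

```python
from typing import Callable, List

Matrix = List[List[float]]

# Hypothetical illustration only: not SMaLL's real interface.
def windowed_op(inp: Matrix, window: int, stride: int, init: float,
                kernel: Callable[[float, float, int, int], float]) -> Matrix:
    """Shared loop nest: slide a window x window region over inp.

    Only `kernel` (the per-element update) differs between layer types.
    """
    rows = (len(inp) - window) // stride + 1
    cols = (len(inp[0]) - window) // stride + 1
    out = [[init] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = init
            for di in range(window):
                for dj in range(window):
                    x = inp[i * stride + di][j * stride + dj]
                    acc = kernel(acc, x, di, dj)
            out[i][j] = acc
    return out

def conv2d(inp: Matrix, weights: Matrix, stride: int = 1) -> Matrix:
    # Kernel: multiply-accumulate against a square weight matrix.
    return windowed_op(inp, len(weights), stride, 0.0,
                       lambda acc, x, di, dj: acc + x * weights[di][dj])

def max_pool(inp: Matrix, window: int, stride: int = 1) -> Matrix:
    # Kernel: running maximum; the window position is ignored.
    return windowed_op(inp, window, stride, float("-inf"),
                       lambda acc, x, di, dj: max(acc, x))
```

In this sketch both layers reuse windowed_op unchanged, so any tuning effort would be concentrated in the few lines passed in as the kernel.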

In this talk, Upasana Sridhar focussed on the specific problems of developing libraries for edge devices and highlighted the lessons learnt from constructing abstractions for performance.

Speaker bio: Upasana Sridhar is a fifth-year PhD student in the Department of Electrical and Computer Engineering at Carnegie Mellon University. Her area of expertise lies in enhancing the accessibility of high-performance code, with a specific focus on the intersection of machine learning, graph algorithms, and related topics. She is actively engaged in developing open-source frameworks to expedite library development and devising methods for evaluating the performance implications of various design choices.

[Talk organised in collaboration with the Department of Computer Science and Automation]