Efficient Processing of Multi-Model Deep Learning Applications

With the development of deep learning technologies, deep neural networks are being adopted in a growing number of application fields. Cloud servers now need to process different computation workloads from multiple tenants simultaneously. In addition, multiple models are often applied together to handle complex tasks, and multi-modal deep learning methods are widely adopted. As a result, efficient processing of multiple deep learning models is becoming more important. Because different models need accelerators with different datapaths to achieve their optimal energy efficiency, conventional single-model accelerators may not handle multi-model tasks well. This lecture discusses efficient processing technologies for multi-model deep learning applications. First, current single-model accelerators are briefly discussed. Then, the inter-model parallelization strategies, computing engine microarchitecture designs, and model scheduling algorithms available in the literature are reviewed. Finally, challenges and future trends for multi-model processing are discussed.

Seok-Bum Ko
