Hanjun Kim
Professor, School of Electrical and Electronic Engineering, Yonsei University
Ph.D. 2013, Department of Computer Science, Princeton University
Office: Engineering Hall #3-C415
Phone: +82-2-2123-2770
Email: first_name at yonsei.ac.kr
Refereed International Conference Publications

Occamy: Memory-efficient GPU Compiler for DNN Inference (IEEE Xplore, Github, PDF)
This work proposes Occamy, a new memory-efficient DNN compiler that reduces the memory usage of a DNN model without affecting its accuracy. For each DNN operation, Occamy analyzes the dimensions of the input and output tensors and their liveness within the operation. Across all operations, Occamy analyzes the liveness of every tensor, generates a memory pool after calculating the maximum required memory size, and schedules when and where to place each tensor in the memory pool. Compared to PyTorch, on an integrated embedded GPU for six DNNs, Occamy reduces memory usage by 34.6% and achieves a geometric mean speedup of 1.25x.
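The general idea of liveness-based memory pooling can be illustrated with a small sketch. This is not Occamy's actual algorithm: the `Tensor` record, the `plan_memory` function, and the greedy first-fit placement are assumptions chosen for illustration only. Each tensor is given a size and the range of operation indices over which it is live, and tensors whose live ranges do not overlap are allowed to share the same offset in a single pool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tensor:
    # Hypothetical record; the field names are illustrative, not from Occamy.
    name: str
    size: int       # bytes
    first_use: int  # index of the operation that produces the tensor
    last_use: int   # index of the last operation that reads it


def overlaps(a: Tensor, b: Tensor) -> bool:
    """True if the two tensors are live during at least one common operation."""
    return a.first_use <= b.last_use and b.first_use <= a.last_use


def plan_memory(tensors):
    """Greedy first-fit placement (an assumed heuristic).

    Returns (offsets, pool_size), where offsets maps tensor name to its
    byte offset inside the pool.
    """
    offsets = {}
    placed = []
    # Place larger tensors first so small ones can fill the gaps.
    for t in sorted(tensors, key=lambda x: -x.size):
        # Pool regions held by already-placed tensors that are live with t.
        busy = sorted(
            (offsets[p.name], offsets[p.name] + p.size)
            for p in placed
            if overlaps(t, p)
        )
        offset = 0
        for start, end in busy:
            if offset + t.size <= start:
                break  # t fits in the gap before this region
            offset = max(offset, end)
        offsets[t.name] = offset
        placed.append(t)
    pool_size = max((offsets[t.name] + t.size for t in tensors), default=0)
    return offsets, pool_size


# A four-operation chain in which each tensor dies soon after it is consumed.
example = [
    Tensor("a", 4, 0, 1),
    Tensor("b", 2, 1, 2),
    Tensor("c", 4, 2, 3),
    Tensor("d", 2, 3, 4),
]
offsets, pool_size = plan_memory(example)
naive_size = sum(t.size for t in example)
print(offsets, pool_size, naive_size)
```

In this toy chain, `a` and `c` share one region and `b` and `d` share another, so the pool needs 6 bytes where allocating every tensor separately would need 12. A real compiler would also have to account for alignment, in-place operations, and the intra-operation liveness the abstract mentions, none of which this sketch models.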