Peak-Memory-aware Partitioning and Scheduling for Multi-tenant DNN Model Inference [abstract] (ScienceDirect) Jaeho Lee, Ju Min Lee, Haeeun Jeong, Hyunho Kwon, Youngsok Kim, Yongjun Park, and Hanjun Kim To Appear: Journal of Systems Architecture, March 2026.
IF=4.1, Q1 (JCR 2025)