2025
FLEX: Fast, Accurate DNN Inference on Low-Cost Edges Using Heterogeneous Accelerator Execution
Tanmoy Sen, Haiying Shen, Anand Iyer
ACM EuroSys 2025
Heterogeneous Graph Neural Network on Semantic Tree
Mingyu Guan, Jack W. Stokes, Qinlong Luo, Fuchen Liu, Purvanshi Mehta, Elnaz Nouri, Taesoo Kim
ReInc: Scaling Training of Dynamic Graph Neural Networks
Mingyu Guan, Saumia Singhal, Taesoo Kim, Anand Iyer
arXiv 2025
Principles and Methodologies for Serial Performance Optimization
Sujin Park, Mingyu Guan, Xiang Cheng, Taesoo Kim
USENIX OSDI 2025
2024
Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving
Yinwei Dai, Rui Pan, Anand Iyer, Kai Li, Ravi Netravali
ACM SOSP 2024
Improving DNN Inference Throughput Using Practical, Per-Input Compute Adaptation
Anand Iyer, Mingyu Guan, Yinwei Dai, Rui Pan, Swapnil Gandhi, Ravi Netravali
ACM SOSP 2024
Lynx: Enabling Efficient MoE Inference through Dynamic Batch-Aware Expert Selection
Vima Gupta, Kartik Sinha, Ada Gavrilovska, Anand Iyer
arXiv 2024
USHER: Holistic Interference Avoidance for Resource Optimized ML Inference
Sudipta Saha Shubha, Haiying Shen, Anand Iyer
USENIX OSDI 2024
Vulcan: Automatic Query Planning for Live ML Analytics
Yiwen Zhang, Xumiao Zhang, Ganesh Ananthanarayanan, Anand Iyer, Yuanchao Shu, Victor Bahl, Z. Morley Mao, Mosharaf Chowdhury
USENIX NSDI 2024
2023
Gemel: Model Merging for Memory-Efficient, Real-Time Video Analytics at the Edge
Arthi Padmanabhan, Neil Agarwal, Anand Iyer, Ganesh Ananthanarayanan, Yuanchao Shu, Nikolaos Karianakis, Guoqing Harry Xu, Ravi Netravali
USENIX NSDI 2023