2026
CHAI: CacHe Attention Inference for text2video
Joel Mathew Cherian,
Ashutosh Muralidhara Bharadwaj,
Vima Gupta,
Anand Iyer
arXiv
2026
OMEGA: A Low-Latency GNN Serving System for Large Graphs
Geon-Woo Kim,
Donghyun Kim,
Jeongyoon Moon,
Henry Liu,
Tarannum Khan,
Anand Iyer,
Daehyeok Kim,
Aditya Akella
IEEE IPDPS
2026
SAFuzz: Semantic-Guided Adaptive Fuzzing for LLM-Generated Code
Ziyi Yang,
Kalit Inani,
Keshav Kabra,
Vima Gupta,
Anand Iyer
arXiv
2026
VTC: DNN Compilation with Virtual Tensors for Data Movement Elimination
Muyan Hu,
Ahan Gupta,
Jiachen Yuan,
Vima Gupta,
Taeksang Kim,
Xin Xu,
Janardhan Kulkarni,
Ofer Dekel,
Vikram Adve,
Charith Mendis
USENIX OSDI
2026
2025
Aragog: Just-in-Time Model Routing for Scalable Serving of Agentic Workflows
Yinwei Dai,
Zhuofu Chen,
Anand Iyer,
Ravi Netravali
arXiv
2025
FLEX: Fast, Accurate DNN Inference on Low-Cost Edges Using Heterogeneous Accelerator Execution
Tanmoy Sen,
Haiying Shen,
Anand Iyer
ACM EuroSys
2025
Heterogeneous Graph Neural Network on Semantic Tree
Mingyu Guan,
Jack W. Stokes,
Qinlong Luo,
Fuchen Liu,
Purvanshi Mehta,
Elnaz Nouri,
Taesoo Kim
ReInc: Scaling Training of Dynamic Graph Neural Networks
Mingyu Guan,
Saumia Singhal,
Taesoo Kim,
Anand Iyer
arXiv
2025
Principles and Methodologies for Serial Performance Optimization
Sujin Park,
Mingyu Guan,
Xiang Cheng,
Taesoo Kim
USENIX OSDI
2025
2024
Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving
Yinwei Dai,
Rui Pan,
Anand Iyer,
Kai Li,
Ravi Netravali
ACM SOSP
2024
Improving DNN Inference Throughput Using Practical, Per-Input Compute Adaptation
Anand Iyer,
Mingyu Guan,
Yinwei Dai,
Rui Pan,
Swapnil Gandhi,
Ravi Netravali
ACM SOSP
2024
Lynx: Enabling Efficient MoE Inference through Dynamic Batch-Aware Expert Selection
Vima Gupta,
Kartik Sinha,
Ada Gavrilovska,
Anand Iyer
arXiv
2024
USHER: Holistic Interference Avoidance for Resource Optimized ML Inference
Sudipta Saha Shubha,
Haiying Shen,
Anand Iyer
USENIX OSDI
2024
Vulcan: Automatic Query Planning for Live ML Analytics
Yiwen Zhang,
Xumiao Zhang,
Ganesh Ananthanarayanan,
Anand Iyer,
Yuanchao Shu,
Victor Bahl,
Z. Morley Mao,
Mosharaf Chowdhury
USENIX NSDI
2024
2023
Gemel: Model Merging for Memory-Efficient, Real-Time Video Analytics at the Edge
Arthi Padmanabhan,
Neil Agarwal,
Anand Iyer,
Ganesh Ananthanarayanan,
Yuanchao Shu,
Nikolaos Karianakis,
Guoqing Harry Xu,
Ravi Netravali
USENIX NSDI
2023