Redwan Ibne Seraj Khan

Machine Learning and Systems Researcher, PhD Candidate

redwan@vt.edu

I am a PhD Candidate at CS@VT. My advisor is Dr. Ali R. Butt. I am affiliated with the Distributed Systems and Storage Lab (DSSL).

My research spans two broad categories: Sys4ML, i.e., designing scalable computing and storage systems that improve the performance and efficiency of (distributed) ML applications, and ML4Sys, i.e., leveraging ML and data-driven approaches to improve systems software and resource management.

Currently, I am working on projects that target several ML domains, namely Distributed Deep Learning (DDL), Large Language Models (LLMs), and Federated Learning (FL):

(1) Building novel data sampling and caching policies, along with hybrid storage systems, to improve the training performance of large-scale DDL workloads (see the sketch below). (2) Developing intelligent procedures for efficient scheduling and resource utilization of DDL/LLM training and inference jobs. (3) Constructing novel client scheduling and sampling mechanisms for privacy-aware, high-performance FL workloads.
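For a flavor of the first direction, below is a minimal sketch, in Python, of an importance-aware caching policy: the cache retains training samples whose most recently observed loss is high, on the premise that hard samples are revisited more often and benefit most from staying in fast storage. The class name, its fields, and the eviction rule are hypothetical illustrations, not the designs published in SHADE or FedCaSe.

```python
# Hypothetical sketch of an importance-aware sample cache (illustrative only;
# not the actual SHADE/FedCaSe design). Samples with higher recent training
# loss are treated as more "important" and are preferred for caching.
class ImportanceCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.store = {}        # sample_id -> cached sample data
        self.importance = {}   # sample_id -> most recent per-sample loss

    def update_importance(self, sample_id, loss):
        # Called from the training loop with each sample's latest loss.
        self.importance[sample_id] = loss

    def get(self, sample_id):
        # Returns cached data, or None on a miss (fetch from backing store).
        return self.store.get(sample_id)

    def put(self, sample_id, data):
        if sample_id in self.store:
            return
        if len(self.store) >= self.capacity:
            # Candidate victim: the resident sample with the lowest importance.
            victim = min(self.store, key=lambda s: self.importance.get(s, 0.0))
            if self.importance.get(victim, 0.0) >= self.importance.get(sample_id, 0.0):
                return  # new sample is no more important than any resident one
            del self.store[victim]
        self.store[sample_id] = data
```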

Before joining CS@VT, I graduated with the highest distinction (Summa Cum Laude) in Computer Engineering from the University at Buffalo, SUNY, in 2019.

news

Nov 2024 Gave a lightning talk on my recent work, FedCaSe, at the Stack@CS Center for Computer Systems at Virginia Tech.
Nov 2024 I was featured on the blog of Chameleon Cloud, NSF’s premier cloud testbed for large-scale research! Check out the full story.
Oct 2024 I was awarded $1,800 in travel grants to attend ACM SoCC 2024! Thanks to Dr. Ali Butt and CS@VT.
Oct 2024 Our paper on adaptive caching and scheduling across edge clients to improve FL performance, titled “FedCaSe: Enhancing Federated Learning with Heterogeneity-aware Caching and Scheduling”, has been accepted at ACM SoCC 2024!
May 2024 Excited to start my research internship at Microsoft for the summer of 2024! I’ll be building systems for running LLM workloads efficiently.
Mar 2024 Gave a guest lecture on Deep Learning Caching Systems for the Big Data Systems course at the University of Virginia. Thanks to Dr. Yue Cheng for inviting me!
Nov 2023 Attended the Supercomputing Conference (SC’23). Thanks to VT and Dr. Ali Butt for providing the travel grants ($1,000).
Apr 2023 Presented a technical talk at IBM Research. Title: Insights into Managing Machine Learning Applications Using Optimal System Resources
Apr 2023 Presented a technical talk at ByteDance. Title: Navigating the Tricky Path to Optimal Performance - Coordinating System Resources with ML Application Needs
Feb 2023 Presented our work, SHADE, at USENIX FAST 2023.
Dec 2022 Our paper on building a novel caching system for Deep Learning workloads has been accepted to USENIX FAST 2023.

selected publications

2025

  1. arXiv
    Ensuring Fair LLM Serving Amid Diverse Applications
    Redwan Ibne Seraj Khan, Kunal Jain, Haiying Shen, and 12 more authors
    arXiv preprint, 2025

2024

  1. ACM SoCC 2024
    FedCaSe: Enhancing Federated Learning with Heterogeneity-aware Caching and Scheduling
    Redwan Ibne Seraj Khan, Arnab K. Paul, Xun Jian, and 2 more authors
    In ACM Symposium on Cloud Computing, Nov 2024

2023

  1. USENIX FAST’23
    SHADE: Enable Fundamental Cacheability for Distributed Deep Learning Training
    Redwan Ibne Seraj Khan, Ahmad Hossein Yazdani, Yuqi Fu, and 5 more authors
    In 21st USENIX Conference on File and Storage Technologies (FAST ’23), Feb 2023

2020

  1. IEEE CLOUD’20
    On the Use of Containers in High Performance Computing Environments
    Subil Abraham, Arnab K. Paul, Redwan Ibne Seraj Khan, and 1 more author
    In 2020 IEEE 13th International Conference on Cloud Computing (CLOUD), 2020