Redwan Ibne Seraj Khan

Machine Learning and Systems Researcher, PhD Candidate

redwan@vt.edu

I am a PhD Candidate at CS@VT. My advisor is Dr. Ali R. Butt. I am affiliated with the Distributed Systems and Storage Lab (DSSL).

My research spans two broad categories: Sys4ML, i.e., designing scalable computing and storage systems that improve the performance and efficiency of (distributed) ML applications, and ML4Sys, i.e., leveraging ML and data-driven approaches to improve systems software and resource management.

Currently, I am working on projects that target several ML domains, namely Distributed Deep Learning (DDL), Large Language Models (LLMs), and Federated Learning (FL):

(1) Building novel data sampling and caching policies, along with hybrid storage systems, to improve the training performance of large-scale DDL workloads (see the sketch below). (2) Developing intelligent procedures for efficient scheduling and resource utilization of DDL/LLM training and inference jobs. (3) Constructing novel client scheduling and sampling mechanisms for privacy-aware, high-performance FL workloads.
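For a flavor of the first direction, below is a minimal sketch, in Python, of an importance-aware caching policy: the cache retains training samples whose most recently observed loss is high, on the premise that hard samples are revisited more often and benefit most from staying in fast storage. The class name, its fields, and the eviction rule are hypothetical illustrations, not the designs published in SHADE or FedCaSe.

```python
# Hypothetical sketch of an importance-aware sample cache (illustrative only;
# not the actual SHADE/FedCaSe design). Samples with higher recent training
# loss are treated as more "important" and are preferred for caching.
class ImportanceCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.store = {}        # sample_id -> cached sample data
        self.importance = {}   # sample_id -> most recent per-sample loss

    def update_importance(self, sample_id, loss):
        # Called from the training loop with each sample's latest loss.
        self.importance[sample_id] = loss

    def get(self, sample_id):
        # Returns cached data, or None on a miss (fetch from backing store).
        return self.store.get(sample_id)

    def put(self, sample_id, data):
        if sample_id in self.store:
            return
        if len(self.store) >= self.capacity:
            # Candidate victim: the resident sample with the lowest importance.
            victim = min(self.store, key=lambda s: self.importance.get(s, 0.0))
            if self.importance.get(victim, 0.0) >= self.importance.get(sample_id, 0.0):
                return  # new sample is no more important than any resident one
            del self.store[victim]
        self.store[sample_id] = data
```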

Before joining CS@VT, I graduated with the highest distinction (Summa Cum Laude) in Computer Engineering from the University at Buffalo, SUNY, in 2019.

news

Nov 2024 Gave a lightning talk on my recent work, FedCaSe, at the Stack@CS Center for Computer Systems at Virginia Tech.
Nov 2024 I was featured on the blog of Chameleon Cloud, NSF’s premier cloud testbed for large-scale research! Check out the full story.
Oct 2024 I was awarded $1,800 in travel grants to attend ACM SoCC 2024! Thanks to Dr. Ali Butt and CS@VT.
Oct 2024 Our paper on adaptive caching and scheduling across edge clients to improve FL performance, titled “FedCaSe: Enhancing Federated Learning with Heterogeneity-aware Caching and Scheduling”, has been accepted at ACM SoCC 2024!
May 2024 Excited to start my research internship at Microsoft for the summer of 2024! I’ll be building systems for running LLM workloads efficiently.
Mar 2024 Gave a guest lecture on Deep Learning Caching Systems for the Big Data Systems course at the University of Virginia. Thanks to Dr. Yue Cheng for inviting me!
Nov 2023 Attended the Supercomputing Conference (SC’23). Thanks to VT and Dr. Ali Butt for providing the travel grants ($1,000).
Apr 2023 Presented a technical talk at IBM Research. Title: Insights into Managing Machine Learning Applications Using Optimal System Resources
Apr 2023 Presented a technical talk at ByteDance. Title: Navigating the Tricky Path to Optimal Performance - Coordinating System Resources with ML Application Needs
Feb 2023 Presented our work, SHADE, at USENIX FAST 2023.
Dec 2022 Our paper on building a novel caching system for Deep Learning workloads has been accepted to USENIX FAST 2023.

selected publications

2025

  1. arXiv
    Ensuring Fair LLM Serving Amid Diverse Applications
    Redwan Ibne Seraj Khan, Kunal Jain, Haiying Shen, and 12 more authors
    arXiv preprint, 2025

2024

  1. ACM SoCC 2024
    FedCaSe: Enhancing Federated Learning with Heterogeneity-aware Caching and Scheduling
    Redwan Ibne Seraj Khan, Arnab K. Paul, Xun Jian, and 2 more authors
    In ACM Symposium on Cloud Computing, Nov 2024

2023

  1. USENIX FAST’23
    SHADE: Enable Fundamental Cacheability for Distributed Deep Learning Training
    Redwan Ibne Seraj Khan, Ahmad Hossein Yazdani, Yuqi Fu, and 5 more authors
    In 21st USENIX Conference on File and Storage Technologies (FAST ’23), Feb 2023

2020

  1. IEEE CLOUD’20
    On the Use of Containers in High Performance Computing Environments
    Subil Abraham, Arnab K. Paul, Redwan Ibne Seraj Khan, and 1 more author
    In 2020 IEEE 13th International Conference on Cloud Computing (CLOUD), 2020