I'm a research engineer working on AI-native video infrastructure. I graduated summa cum laude from the University of Maryland, College Park, where I earned B.S. degrees in Computer Science and Mathematics. I am fortunate to be a part of Abhinav Shrivastava's Perception and Intelligence Lab.
I'm currently applying to computer vision PhD programs (Fall 2026). If you'd like to collaborate on image/video generation and understanding projects, please email me!
Previously, I interned at Amazon AWS EC2 Nitro, building tools for detecting server issues across millions of machines, and at Anello Photonics, working on data compression pipelines and automated photonic gyroscope testing. I've also explored real-time ML applications in surgical wearables and helped build autonomous drones.
My interests include computer vision, robotics, and deep learning.
Outside of research, I love reading sci-fi and fantasy novels (favorites include Dune and The Way of Kings) and training in Brazilian Jiu-Jitsu, Muay Thai, and MMA.
Publications
Evolutionary Caching to Accelerate Your Off-the-Shelf Diffusion Model
Anirud Aggarwal, Abhinav Shrivastava, Matthew Gwilliam
International Conference on Learning Representations (ICLR), 2025
We introduce ECAD, an evolutionary algorithm that automatically discovers efficient caching schedules for accelerating diffusion-based image generation models. ECAD is faster than state-of-the-art training-free methods while achieving higher quality, and it generalizes across models and resolutions.
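As a rough illustration of the idea (not ECAD's actual implementation), here is a minimal genetic-algorithm sketch that evolves a per-step binary caching mask; the fitness weights and the evaluate_quality placeholder are assumptions for demonstration only.

```python
# Illustrative sketch: evolutionary search over a binary per-step caching schedule.
# Not the paper's algorithm; quality evaluation is mocked for demonstration.
import random

NUM_STEPS = 50        # denoising steps in the diffusion sampler
POP_SIZE = 16
GENERATIONS = 20
QUALITY_WEIGHT = 1.0
SPEED_WEIGHT = 0.5

def evaluate_quality(schedule):
    # Placeholder: in practice, run the diffusion model with caching enabled
    # at the steps marked True and score the generated images.
    return 1.0 - 0.01 * sum(schedule)  # mock: more caching, slightly lower quality

def fitness(schedule):
    # Reward quality plus the fraction of steps whose features are reused (cached).
    speedup = sum(schedule) / NUM_STEPS
    return QUALITY_WEIGHT * evaluate_quality(schedule) + SPEED_WEIGHT * speedup

def mutate(schedule, rate=0.05):
    # Flip each bit with a small probability.
    return [bit ^ (random.random() < rate) for bit in schedule]

def crossover(a, b):
    # Single-point crossover between two parent schedules.
    cut = random.randrange(1, NUM_STEPS)
    return a[:cut] + b[cut:]

population = [[random.random() < 0.3 for _ in range(NUM_STEPS)] for _ in range(POP_SIZE)]
for _ in range(GENERATIONS):
    population.sort(key=fitness, reverse=True)
    parents = population[: POP_SIZE // 2]
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(POP_SIZE - len(parents))]
    population = parents + children

best = max(population, key=fitness)
print("cache at steps:", [i for i, bit in enumerate(best) if bit])
```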
UPLiFT: Efficient Pixel-Dense Feature Upsampling with Local Attenders
Matthew Walmer, Saksham Suri, Anirud Aggarwal, Abhinav Shrivastava
arXiv, 2025
UPLiFT is a lightweight, iterative feature upsampler that converts coarse ViT and VAE features into pixel-dense representations using a fully local attention operator. It achieves state-of-the-art performance on segmentation and depth tasks while scaling linearly in tokens, and extends naturally to generative tasks for efficient image upscaling.
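For a concrete picture of what a fully local attention operator for upsampling can look like, here is an illustrative PyTorch sketch of a generic guided 2x feature upsampler. It is not the UPLiFT architecture; the layer names, shapes, and window size are assumptions for a runnable example.

```python
# Illustrative only: a generic 2x local-attention feature upsampler guided by the image.
import torch
import torch.nn.functional as F
from torch import nn

class LocalAttentionUpsampler2x(nn.Module):
    def __init__(self, feat_dim, guide_dim=3, key_dim=32, window=3):
        super().__init__()
        self.window = window
        self.to_query = nn.Conv2d(guide_dim, key_dim, 3, padding=1)  # queries from the high-res image
        self.to_key = nn.Conv2d(feat_dim, key_dim, 1)                # keys from the coarse features

    def forward(self, feats, guide):
        # feats: (B, C, H, W) coarse features; guide: (B, guide_dim, 2H, 2W) image
        B, C, H, W = feats.shape
        up = F.interpolate(feats, scale_factor=2, mode="nearest")
        keys = F.interpolate(self.to_key(feats), scale_factor=2, mode="nearest")
        queries = self.to_query(guide)

        k, pad = self.window, self.window // 2
        # Gather each output pixel's k*k local neighborhood of keys and values.
        key_patches = F.unfold(keys, k, padding=pad).view(B, -1, k * k, 2 * H * 2 * W)
        val_patches = F.unfold(up, k, padding=pad).view(B, C, k * k, 2 * H * 2 * W)
        q = queries.flatten(2).unsqueeze(2)                   # (B, key_dim, 1, N)

        attn = (q * key_patches).sum(1, keepdim=True)         # similarity within the window
        attn = attn.softmax(dim=2)
        out = (attn * val_patches).sum(2)                     # weighted local average of values
        return out.view(B, C, 2 * H, 2 * W)

# Example: upsample 32x32 ViT-like features to 64x64 guided by the image.
feats = torch.randn(1, 384, 32, 32)
image = torch.randn(1, 3, 64, 64)
dense = LocalAttentionUpsampler2x(384)(feats, image)
print(dense.shape)  # torch.Size([1, 384, 64, 64])
```

Because attention is restricted to a fixed k×k window, compute grows linearly with the number of output pixels, which is the property the paper's local operator is designed around.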
Projects
Side projects and unpublished research work.
Fast & Faithful: Diffusion Drift
Anirud Aggarwal*, Omkar Pathak*, Nayana Gadde*
unpublished CMSC 848R (Instructor: Sarah Wiegreffe), 2025
Do accelerated diffusion language models reason faithfully? We introduce a framework for measuring Diffusion Chain-of-Thought (DoT) faithfulness and analyze how training-free acceleration affects reasoning dynamics in LLaDA-8B and dLLM-Cache on GSM8K.
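One common style of faithfulness probe (not necessarily the metric used in this project) checks whether the final answer actually depends on the intermediate reasoning; a minimal sketch, assuming a hypothetical generate_answer wrapper around the model:

```python
# Hedged sketch of a truncation-based faithfulness probe; generate_answer is a
# hypothetical callable wrapping a (possibly accelerated) diffusion language model.
def truncation_sensitivity(problem, reasoning_steps, generate_answer):
    """Fraction of truncation points at which the model's answer changes.

    Low sensitivity suggests the stated reasoning may not be load-bearing
    (less faithful); high sensitivity suggests the answer depends on it.
    """
    full_answer = generate_answer(problem, reasoning_steps)
    changed = 0
    for cut in range(len(reasoning_steps)):
        if generate_answer(problem, reasoning_steps[:cut]) != full_answer:
            changed += 1
    return changed / max(len(reasoning_steps), 1)

# Usage idea: compute this score with standard decoding, then again with a
# training-free accelerator (e.g., feature caching) enabled, and compare the
# two distributions over a GSM8K subset.
```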
We build a real-time face-blurring system that redacts faces in live and recorded video while keeping chosen identities visible. The modular pipeline combines YuNet/SCRFD detection, SORT tracking, SFace recognition, and configurable blur options for flexible speed–accuracy tradeoffs. On crowded IRL Twitch footage, it runs faster than real time on CPU and favors privacy by accepting a few extra blur boxes rather than missing a face.
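For a flavor of the detect-then-blur stage only (without the tracking and identity-whitelisting components), here is a stripped-down OpenCV sketch using YuNet detection and Gaussian blur; the model path and thresholds are assumptions, and the YuNet ONNX model must be downloaded separately.

```python
# Minimal detect-then-blur loop: YuNet face detection via OpenCV + Gaussian blur.
import cv2

MODEL_PATH = "face_detection_yunet_2023mar.onnx"  # assumed local path to the YuNet model

cap = cv2.VideoCapture(0)  # or a video file path
detector = cv2.FaceDetectorYN.create(MODEL_PATH, "", (320, 320), 0.6)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    h, w = frame.shape[:2]
    detector.setInputSize((w, h))
    _, faces = detector.detect(frame)          # faces: Nx15 array or None
    for face in (faces if faces is not None else []):
        x, y, fw, fh = face[:4].astype(int)
        x, y = max(x, 0), max(y, 0)
        roi = frame[y:y + fh, x:x + fw]
        if roi.size:                            # blur the detected face region in place
            frame[y:y + fh, x:x + fw] = cv2.GaussianBlur(roi, (51, 51), 0)
    cv2.imshow("redacted", frame)
    if cv2.waitKey(1) == 27:                    # Esc to quit
        break

cap.release()
cv2.destroyAllWindows()
```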
Learning to Settle: Reinforcement Learning in Catan
Anirud Aggarwal, Jinhai Yan, Serena Huang, Rohit Kommuru, Monish Napa, Han Lin
unpublished CMSC 472, 2024
We build a custom PettingZoo environment and training stack for learning to play the board game Catan with reinforcement learning. Starting from a refactored Settlers-RL codebase, we explore both multi-agent methods (via MARLlib) and a single-agent PPO baseline with dense reward shaping. Our experiments show agents that learn to play shorter, higher-scoring games, while highlighting the remaining gap to robust multi-agent performance in non-stationary, multi-player settings.
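As an illustration of what dense reward shaping can look like in this setting, here is a small sketch; the state keys, terms, and weights are hypothetical, not the project's actual reward function. Shaping like this provides a learning signal every turn rather than only from the sparse end-of-game win.

```python
# Hedged sketch of dense reward shaping for a Catan agent; all terms and weights
# are illustrative assumptions.
def shaped_reward(prev_state: dict, state: dict, won: bool) -> float:
    reward = 0.0
    reward += 1.0 * (state["victory_points"] - prev_state["victory_points"])   # progress toward 10 VP
    reward += 0.1 * (state["num_settlements"] - prev_state["num_settlements"])
    reward += 0.2 * (state["num_cities"] - prev_state["num_cities"])
    reward += 0.01 * (state["num_resources"] - prev_state["num_resources"])    # small incentive to collect
    reward += 10.0 if won else 0.0                                             # sparse terminal bonus
    return reward
```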