[ISMM'25] EMD: Fair and Efficient Dynamic Memory De-bloating of Transparent Huge Pages
24th ACM SIGPLAN International Symposium on Memory Management (ISMM) 2025.
[ASPLOS'25] POD-Attention: Unlocking Full Prefill-Decode Overlap for Faster LLM Inference
30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS) 2025.
[ASPLOS'25] vAttention: Dynamic Memory Management for Serving LLMs without PagedAttention
30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS) 2025.
[OSDI'24] Taming Throughput-Latency Trade-off in LLM Inference with Sarathi-Serve
18th USENIX Symposium on Operating Systems Design and Implementation (OSDI) 2024.
[MLSys'24] VIDUR: A Large-Scale Simulation Framework for LLM Inference
7th Annual Conference on Machine Learning and Systems (MLSys) 2024.
[PETS'24] SIGMA: Secure GPT Inference with Function Secret Sharing
24th Privacy Enhancing Technologies Symposium (PETS) 2024.
[IEEE CAL'24] Address Scaling: Architectural Support for Fine-Grained Thread-Safe Metadata Management
IEEE Computer Architecture Letters 2024.
[arXiv] SARATHI: Efficient LLM Inference by Piggybacking Decodes with Chunked Prefills
arXiv 2023.
[MICRO'21] Trident: Harnessing Architectural Resources for All Page Sizes in x86 Processors
54th IEEE/ACM International Symposium on Microarchitecture (MICRO) 2021.
[PACT'21] nuKSM: NUMA-aware Memory De-duplication on Multi-socket Servers
30th International Conference on Parallel Architectures and Compilation Techniques (PACT) 2021.
[ASPLOS'21] Fast Local Page-Tables for Virtualized NUMA Servers with vMitosis
26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS) 2021.
[ASPLOS'20] Mitosis: Transparently Self-Replicating Page-Tables for Large-Memory Machines
25th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS) 2020.
[ASPLOS'19] HawkEye: Efficient Fine-grained OS Support for Huge Pages
24th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS) 2019.
[ASPLOS'18] Making Huge Pages Actually Useful
23rd ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS) 2018.
[APSys'16] A Case for Protecting Huge Pages from the Kernel
7th ACM SIGOPS Asia-Pacific Workshop on Systems (APSys) 2016.
I have served or will be serving as a reviewer on the following program committees:
- ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS) 2025
- ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS) 2024
- USENIX Annual Technical Conference (ATC) 2025
- USENIX Annual Technical Conference (ATC) 2024
- ACM SIGPLAN International Symposium on Memory Management (ISMM) 2024
External reviewer:
- IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS) 2025
- ACM Transactions on Computer Systems (TOCS) 2023
- ACM Transactions on Architecture and Code Optimization (TACO) 2022