Publications

2024

Fairness in Serving Large Language Models.

Ying Sheng, Shiyi Cao, Dacheng Li, Banghua Zhu, Zhuohan Li, Danyang Zhuo, Joseph E. Gonzalez, Ion Stoica:
Fairness in Serving Large Language Models. CoRR abs/2401.00588 (2024)

Nebula: A Privacy-First Platform for Data Backhaul.

Jean-Luc Watson, Tess Despres, Alvin Tan, Shishir G. Patil, Prabal Dutta, Raluca Ada Popa:
Nebula: A Privacy-First Platform for Data Backhaul. IACR Cryptol. ePrint Arch. 2024: 409 (2024)

Break the Sequential Dependency of LLM Inference Using Lookahead Decoding.

Yichao Fu, Peter Bailis, Ion Stoica, Hao Zhang:
Break the Sequential Dependency of LLM Inference Using Lookahead Decoding. CoRR abs/2402.02057 (2024)

2023

Optimizing Stateful Dataflow with Local Rewrites.

Shadaj Laddad, Conor Power, Tyler Hou, Alvin Cheung, Joseph M. Hellerstein:
Optimizing Stateful Dataflow with Local Rewrites. CoRR abs/2306.10585 (2023)

Efficient Memory Management for Large Language Model Serving with PagedAttention.

Woosuk Kwon, Zhuohan Li, Siyuan Zhuang, Ying Sheng, Lianmin Zheng, Cody Hao Yu, Joseph E. Gonzalez, Hao Zhang, Ion Stoica:
Efficient Memory Management for Large Language Model Serving with PagedAttention. CoRR abs/2309.06180 (2023)

LightSeq: Sequence Level Parallelism for Distributed Training of Long Context Transformers.

Dacheng Li, Rulin Shao, Anze Xie, Eric P. Xing, Joseph E. Gonzalez, Ion Stoica, Xuezhe Ma, Hao Zhang:
LightSeq: Sequence Level Parallelism for Distributed Training of Long Context Transformers. CoRR abs/2310.03294 (2023)

S-LoRA: Serving Thousands of Concurrent LoRA Adapters.

Ying Sheng, Shiyi Cao, Dacheng Li, Coleman Hooper, Nicholas Lee, Shuo Yang, Christopher Chou, Banghua Zhu, Lianmin Zheng, Kurt Keutzer, Joseph E. Gonzalez, Ion Stoica:
S-LoRA: Serving Thousands of Concurrent LoRA Adapters. CoRR abs/2311.03285 (2023)

RALF: Accuracy-Aware Scheduling for Feature Store Maintenance.

Sarah Wooders, Xiangxi Mo, Amit Narang, Kevin Lin, Ion Stoica, Joseph M. Hellerstein, Natacha Crooks, Joseph E. Gonzalez:
RALF: Accuracy-Aware Scheduling for Feature Store Maintenance. Proc. VLDB Endow. 17(3): 563-576 (2023)

Mammoths Are Slow: The Overlooked Transactions of Graph Data.

Audrey Cheng, Jack Waudby, Hugo Firth, Natacha Crooks, Ion Stoica:
Mammoths Are Slow: The Overlooked Transactions of Graph Data. Proc. VLDB Endow. 17(4): 904-911 (2023)

2022

Reliable Transactions in Serverless-Edge Architecture

Published: ICDE'23 (IEEE International Conference on Data Engineering)

Authors: Suyash Gupta, Sajjad Rahnama, Erik Linsenmayer, Faisal Nawab, Mohammad Sadoghi

Modern edge applications demand novel solutions in which they do not have to rely on a single cloud provider (which cannot be in the vicinity of every edge device) or on dedicated edge servers (which cannot scale as clouds do) for processing compute-intensive tasks. A recent computing philosophy, Sky computing, proposes giving each user the ability to select among available cloud providers.
In this paper, we present our serverless-edge co-design, which extends the Sky computing vision. In our serverless-edge co-design, we expect edge devices to collaborate and spawn the required number of serverless functions. This raises several key challenges: (1) how will this collaboration take place, (2) what if some edge devices are compromised, and (3) what if a selected cloud provider is malicious. Hence, we design ServerlessBFT, the first protocol to guarantee Byzantine fault-tolerant (BFT) transactional flow between edge devices and serverless functions. We present an exhaustive list of attacks on our serverless-edge co-design and their solutions. Further, we extensively benchmark our architecture across a variety of parameters.

Jiffy: elastic far-memory for stateful serverless analytics

A Khandelwal, Y Tang, R Agarwal, A Akella, I Stoica. Proceedings of the Seventeenth European Conference on Computer Systems (EuroSys '22)

The Sky Above The Clouds

Technology ecosystems often undergo significant transformations as they mature. For example, telephony, the Internet, and PCs all started with a single provider, but in the United States each is now served by a competitive market that uses comprehensive and universal technology standards to provide compatibility. This white paper presents our view on how the cloud ecosystem, barely over fifteen years old, could evolve as it matures.

CostCO: An automatic cost modeling framework for secure multi-party computation

Vivian Fang, Lloyd Brown, William Lin, Wenting Zheng, Aurojit Panda, Raluca Ada Popa

New Directions in Cloud Programming

11th Conference on Innovative Data Systems Research, CIDR 2021. Cheung, A.; Crooks, N.; Hellerstein, J. M.; and Milano, M.

Serverless Boom or Bust? An Analysis of Economic Incentives

12th USENIX Workshop on Hot Topics in Cloud Computing (HotCloud). Charles Lin, Joseph E. Gonzalez, and Joseph M. Hellerstein.

2021

Snoopy: Surpassing the Scalability Bottleneck of Oblivious Storage

E Dauterman, V Fang, I Demertzis, N Crooks, RA Popa. Proceedings of the ACM SIGOPS 28th Symposium on Operating Systems Principles

Basil: Breaking up BFT with ACID (transactions)

Florian Suri-Payer, Matthew Burke, Zheng Wang, Yunhao Zhang, Lorenzo Alvisi, Natacha Crooks. Proceedings of the ACM SIGOPS 28th Symposium on Operating Systems Principles