Publications

2024

Fairness in Serving Large Language Models.

Ying Sheng, Shiyi Cao, Dacheng Li, Banghua Zhu, Zhuohan Li, Danyang Zhuo, Joseph E. Gonzalez, Ion Stoica:
Fairness in Serving Large Language Models. CoRR abs/2401.00588 (2024)

Break the Sequential Dependency of LLM Inference Using Lookahead Decoding.

Yichao Fu, Peter Bailis, Ion Stoica, Hao Zhang:
Break the Sequential Dependency of LLM Inference Using Lookahead Decoding. CoRR abs/2402.02057 (2024)

2023

SkyPilot: An Intercloud Broker for Sky Computing.

Zongheng Yang, Zhanghao Wu, Michael Luo, Wei-Lin Chiang, Romil Bhardwaj, Woosuk Kwon, Siyuan Zhuang, Frank Sifei Luan, Gautam Mittal, Scott Shenker, Ion Stoica:
SkyPilot: An Intercloud Broker for Sky Computing. NSDI 2023: 437-455

AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving.

Zhuohan Li, Lianmin Zheng, Yinmin Zhong, Vincent Liu, Ying Sheng, Xin Jin, Yanping Huang, Zhifeng Chen, Hao Zhang, Joseph E. Gonzalez, Ion Stoica:
AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving. OSDI 2023: 663-679

Invited Paper: Initial Steps Toward a Compiler for Distributed Programs.

Joseph M. Hellerstein, Shadaj Laddad, Mae Milano, Conor Power, Mingwei Samuel:
Invited Paper: Initial Steps Toward a Compiler for Distributed Programs. ApPLIED@PODC 2023: 5:1-5:10

LLM-Assisted Code Cleaning For Training Accurate Code Generators.

Naman Jain, Tianjun Zhang, Wei-Lin Chiang, Joseph E. Gonzalez, Koushik Sen, Ion Stoica:
LLM-Assisted Code Cleaning For Training Accurate Code Generators. CoRR abs/2311.14904 (2023)

S-LoRA: Serving Thousands of Concurrent LoRA Adapters.

Ying Sheng, Shiyi Cao, Dacheng Li, Coleman Hooper, Nicholas Lee, Shuo Yang, Christopher Chou, Banghua Zhu, Lianmin Zheng, Kurt Keutzer, Joseph E. Gonzalez, Ion Stoica:
S-LoRA: Serving Thousands of Concurrent LoRA Adapters. CoRR abs/2311.03285 (2023)

Efficiently Programming Large Language Models using SGLang.

Lianmin Zheng, Liangsheng Yin, Zhiqiang Xie, Jeff Huang, Chuyue Sun, Cody Hao Yu, Shiyi Cao, Christos Kozyrakis, Ion Stoica, Joseph E. Gonzalez, Clark W. Barrett, Ying Sheng:
Efficiently Programming Large Language Models using SGLang. CoRR abs/2312.07104 (2023)

CodeScholar: Growing Idiomatic Code Examples.

Manish Shetty, Koushik Sen, Ion Stoica:
CodeScholar: Growing Idiomatic Code Examples. CoRR abs/2312.15157 (2023)

Multiversion Hindsight Logging for Continuous Training.

Rolando Garcia, Anusha Dandamudi, Gabriel Matute, Lehan Wan, Joseph Gonzalez, Joseph M. Hellerstein, Koushik Sen:
Multiversion Hindsight Logging for Continuous Training. CoRR abs/2310.07898 (2023)

2022

Reliable Transactions in Serverless-Edge Architecture

Published: ICDE'23 (IEEE International Conference on Data Engineering)

Authors: Suyash Gupta, Sajjad Rahnama, Erik Linsenmayer, Faisal Nawab, Mohammad Sadoghi

Modern edge applications demand novel solutions in which they do not have to rely on a single cloud provider (which cannot be in the vicinity of every edge device) or on dedicated edge servers (which cannot scale as clouds do) to process compute-intensive tasks. A recent computing philosophy, Sky computing, proposes giving each user the ability to select among available cloud providers.
In this paper, we present our serverless-edge co-design, which extends the Sky computing vision. In our serverless-edge co-design, we expect edge devices to collaborate and spawn the required number of serverless functions. This raises several key challenges: (1) how will this collaboration take place, (2) what if some edge devices are compromised, and (3) what if a selected cloud provider is malicious. Hence, we design ServerlessBFT, the first protocol to guarantee Byzantine fault-tolerant (BFT) transactional flow between edge devices and serverless functions. We present an exhaustive list of attacks on our serverless-edge co-design and their solutions. Further, we extensively benchmark our architecture on a variety of parameters.

Jiffy: elastic far-memory for stateful serverless analytics

A Khandelwal, Y Tang, R Agarwal, A Akella, I Stoica. Proceedings of the Seventeenth European Conference on Computer Systems (EuroSys '22)

The Sky Above The Clouds

Technology ecosystems often undergo significant transformations as they mature. For example, telephony, the Internet, and PCs all started with a single provider, but in the United States each is now served by a competitive market that uses comprehensive and universal technology standards to provide compatibility. This white paper presents our view on how the cloud ecosystem, barely over fifteen years old, could evolve as it matures.

CostCO: An automatic cost modeling framework for secure multi-party computation

Vivian Fang, Lloyd Brown, William Lin, Wenting Zheng, Aurojit Panda, Raluca Ada Popa

New Directions in Cloud Programming

11th Conference on Innovative Data Systems Research, CIDR 2021 - Cheung, A.; Crooks, N.; Hellerstein, J. M.; and Milano, M.

Serverless Boom or Bust? An Analysis of Economic Incentives

12th USENIX Workshop on Hot Topics in Cloud Computing (HotCloud) - Charles Lin, Joseph E. Gonzalez, and Joseph M. Hellerstein.

2021

Snoopy: Surpassing the Scalability Bottleneck of Oblivious Storage

E Dauterman, V Fang, I Demertzis, N Crooks, RA Popa. Proceedings of the ACM SIGOPS 28th Symposium on Operating Systems Principles

Basil: Breaking up BFT with ACID (transactions)

Florian Suri-Payer, Matthew Burke, Zheng Wang, Yunhao Zhang, Lorenzo Alvisi, Natacha Crooks. Proceedings of the ACM SIGOPS 28th Symposium on Operating Systems Principles