Papers
2023
ADELT: Transpilation Between Deep Learning Frameworks
Linyuan Gong, Jiayi Wang, Alvin Cheung
Building Code Transpilers for Domain-Specific Languages Using Program Synthesis (Experience Paper)
Sahil Bhatia, Sumer Kohli, Sanjit A. Seshia, Alvin Cheung
A PACTful Agenda for Cloud Programming Research (Invited Talk)
Alvin Cheung
Distributed Matrix-Based Sampling for Graph Neural Network Training
Alok Tripathy, Katherine Yelick, Aydın Buluç
Distributed-Memory Parallel Contig Generation for De Novo Long-Read Genome Assembly
Giulia Guidi, Gabriel Raulet, Daniel Rokhsar, Leonid Oliker, Katherine Yelick, Aydın Buluç
Scaling Generalized N-Body Problems, A Case Study from Genomics
Marquita Ellis, Aydın Buluç, Katherine Yelick
BELLA: Berkeley Efficient Long-Read to Long-Read Aligner and Overlapper
Giulia Guidi, Marquita Ellis, Daniel Rokhsar, Katherine Yelick, Aydın Buluç
Fast Multiplication of Random Dense Matrices with Fixed Sparse Matrices
Tianyu Liang, Riley Murray, Aydın Buluç, James Demmel
Surrogate-based Autotuning for Randomized Sketching Algorithms in Regression Problems
Younghyun Cho, James W. Demmel, Michał Dereziński, Haoyun Li, Hengrui Luo, Michael W. Mahoney, Riley J. Murray
Fast Exact Leverage Score Sampling from Khatri-Rao Products with Applications to Tensor Decomposition
Vivek Bharadwaj, Osman Asif Malik, Riley Murray, Laura Grigori, Aydın Buluç, James Demmel
Harnessing the Crowd for Autotuning High-Performance Computing Applications
Younghyun Cho, James W. Demmel, Jacob King, Xiaoye S. Li, Yang Liu, Hengrui Luo
Distributed-Memory Randomized Algorithms for Sparse Tensor CP Decomposition
Vivek Bharadwaj, Osman Asif Malik, Riley Murray, Aydın Buluç, James Demmel
Hybrid Models for Mixed Variables in Bayesian Optimization
Hengrui Luo, Younghyun Cho, James W. Demmel, Xiaoye S. Li, Yang Liu
GPTune: Multitask Learning for Autotuning Exascale Applications
Yang Liu, Wissam M. Sid-Lakhdar, Osni Marques, Xinran Zhu, Chang Meng, James W. Demmel, Xiaoye S. Li
Memory-Efficient Hardware Performance Counters with Approximate-Counting Algorithms
Jingyi Xu, Sehoon Kim, Borivoje Nikolic
Gemmini: Enabling Systematic Deep-Learning Architecture Evaluation via Full-Stack Integration
Hasan Genc, Seah Kim, Alon Amid, Ameer Haj-Ali, Vighnesh Iyer, Pranav Prakash, Jerry Zhao, Daniel Grubb, Harrison Liew, Howard Mao, Albert J. Ou, Colin Schmidt, Samuel Steffl, John Charles Wright, Ion Stoica, Jonathan Ragan-Kelley, Krste Asanovic, Borivoje Nikolic, Yakun Sophia Shao
A 16mm² 106.1 GOPS/W Heterogeneous RISC-V Multi-Core Multi-Accelerator SoC in Low-Power 22nm FinFET
Abraham Gonzalez, Jerry Zhao, Ben Korpan, Hasan Genc, Colin Schmidt, John Charles Wright, Ayan Biswas, Alon Amid, Farhana Sheikh, Anton Sorokin, Sirisha Kale, Mani Yalamanchi, Ramya Yarlagadda, Mark Flannigan, Larry Abramowitz, Elad Alon, Yakun Sophia Shao, Krste Asanovic, Borivoje Nikolic
CoSA: Scheduling by Constrained Optimization for Spatial Accelerators
Qijing Huang, Aravind Kalaiah, Minwoo Kang, James Demmel, Grace Dinh, John Wawrzynek, Thomas Norell, Yakun Sophia Shao
Verifying RISC-V Physical Memory Protection
Kevin Cheang, Cameron Rasmussen, Dayeol Lee, David W. Kohlbrenner, Krste Asanovic, Sanjit A. Seshia
Profiling Hyperscale Big Data Processing
Abraham Gonzalez, Aasheesh Kolli, Samira Manabi Khan, Sihang Liu, Vidushi Dadu, Sagar Karandikar, Jichuan Chang, Krste Asanovic, Parthasarathy Ranganathan
Hammer: A Modular and Reusable Physical Design Flow Tool (Invited)
Harrison Liew, Daniel Grubb, John Wright, Colin Schmidt, Nayiri Krzysztofowicz, Adam M. Izraelevitz, Edward Wang, Krste Asanovic, Jonathan Bachrach, Borivoje Nikolic
4.3 An Eight-Core 1.44GHz RISC-V Vector Machine in 16nm FinFET
Colin Schmidt, John Charles Wright, Zhongkai Wang, Eric Chang, Albert J. Ou, Woo-Rham Bae, Sean Huang, Anita Flynn, Brian C. Richards, Krste Asanovic, Elad Alon
COBRA: A Framework for Evaluating Compositions of Hardware Branch Predictors
Jerry Zhao, Abraham Gonzalez, Alon Amid, Sagar Karandikar, Krste Asanovic
Simulator Independent Coverage for RTL Hardware Languages
Kevin Laeufer, Vighnesh Iyer, David Biancolin, Jonathan Bachrach
A Hardware Accelerator for Protocol Buffers
Sagar Karandikar, Chris Leary, Chris Kennelly, Jerry Zhao, Dinesh Parimi, Borivoje Nikolic, Krste Asanovic, Parthasarathy Ranganathan
An Automated and Process-Portable Generator for Phase-Locked Loop
Zhongkai Wang, Minsoo Choi, Eric Chang, John Charles Wright, Woo-Rham Bae, Sijun Du, Zhaokai Liu, Nathan Narevsky, Colin Schmidt, Ayan Biswas, Borivoje Nikolic, Elad Alon
Accessible, FPGA Resource-Optimized Simulation of Multiclock Systems in FireSim
David Biancolin, Albert Magyar, Sagar Karandikar, Alon Amid, Borivoje Nikolic, Jonathan Bachrach, Krste Asanovic
Code Transpilation for Hardware Accelerators
Yuto Nishida, Sahil Bhatia, Shadaj Laddad, Hasan Genc, Yakun Sophia Shao, Alvin Cheung
Full Stack Optimization of Transformer Inference: a Survey
Sehoon Kim, Coleman Hooper, Thanakul Wattanawong, Minwoo Kang, Ruohan Yan, Hasan Genc, Grace Dinh, Qijing Huang, Kurt Keutzer, Michael W. Mahoney, Yakun Sophia Shao, Amir Gholami
RoSÉ: A Hardware-Software Co-Simulation Infrastructure Enabling Pre-Silicon Full-Stack Robotics SoC Evaluation
Dima Nikiforov, Shengjun Chris Dong, Chengyi Lux Zhang, Seah Kim, Borivoje Nikolic, Yakun Sophia Shao
CDPU: Co-designing Compression and Decompression Processing Units for Hyperscale Systems
Sagar Karandikar, Aniruddha N. Udipi, Junsun Choi, Joonho Whangbo, Jerry Zhao, Svilen Kanev, Edwin Lim, Jyrki Alakuijala, Vrishab Madduri, Yakun Sophia Shao, Borivoje Nikolic, Krste Asanovic, Parthasarathy Ranganathan
MoCA: Memory-Centric, Adaptive Execution for Multi-Tenant Deep Neural Networks
Seah Kim, Hasan Genc, Vadim Vadimovich Nikiforov, Krste Asanovic, Borivoje Nikolic, Yakun Sophia Shao
Vertically Integrated Computing Labs Using Open-Source Hardware Generators and Cloud-Hosted FPGAs
Alon Amid, Albert J. Ou, Krste Asanovic, Yakun Sophia Shao, Borivoje Nikolic