I am a Ph.D. candidate at the Computer Systems Laboratory, Cornell University, supervised by Prof. Zhiru Zhang. I received my B.E. degree from the School of Computer Science and Engineering, Sun Yat-sen University in 2021.
My research interests broadly lie in domain-specific languages and compilers, efficient runtime systems, and accelerator architecture. In particular, I attempt to bridge the productivity and performance gap between emerging machine learning applications and heterogeneous hardware (CPU/GPU/FPGA).
Currently, I am working on compiler optimizations for (1) large-scale model training/inference in distributed environments and (2) scalable hardware accelerator design for deep learning and scientific applications. Feel free to drop me an email if you have aligned interests.
Education
- Cornell University, US (Aug. 2021 - Present)
  Ph.D. in Computer Science
  Thesis: Composable Programming Models for Accelerated Computing
  Cumulative GPA: 4.0/4.0
- Cornell University, US (Aug. 2021 - Oct. 2024)
  M.S. in Computer Science
- Sun Yat-sen University, China (Aug. 2017 - Jun. 2021)
  B.E. in Computer Science
  Thesis: High-Performance Concurrent Graph Processing System (Outstanding Undergraduate Thesis)
  Overall GPA: 3.95/4.00 (Major GPA: 3.99/4.00), Ranking: 1/188
Work Experience
- NVIDIA, Redmond, WA, US (May 2024 - Nov. 2024)
  Research Intern, Deep Learning Compiler Technology Team
  Mentors: Bin Fan and Vinod Grover
- Amazon Web Services (AWS), Santa Clara, CA, US (Aug. 2022 - Apr. 2023)
  Applied Scientist Intern, Deep Engine-Science Team
  Mentors: Cody Hao Yu, Shuai Zheng, and Yida Wang
- ByteDance AI Lab, Beijing, China (Aug. 2020 - May 2021)
  Research Intern, MLSys Team, Applied Machine Learning (AML)
  Mentors: Jun He and Yibo Zhu
News
- [11/20/24] [Service] Served as an external reviewer of MLSys’25 and joined the OOPSLA’25 Artifact Evaluation Committee.
- [10/16/24] [Talk] I passed the Examination for Admission to Candidacy (A Exam) and became a PhD candidate! Thanks for all the support!
- [10/01/24] [Talk] Niansong and I will attend the annual review of the SRC JUMP 2.0 ACE Center in Chicago from Oct 1 to Oct 3 and give a presentation on Allo. See you there!
- [08/24/24] [Service] Served as a reviewer of ICLR’25.
- [08/22/24] [Talk] I gave a final presentation for my internship project on Automatic Warp Specialization for Hopper Architecture at NVIDIA. I will continue working on it as a part-time intern until November.
- [07/01/24] [Talk] I will be attending the 2024 MLSys Rising Stars workshop at the NVIDIA Headquarters in Santa Clara, CA from July 15 to July 16. See you in the Bay Area!
- [06/27/24] [Award] I received 3rd place in the ACM SIGPLAN PLDI Student Research Competition (SRC).
- [06/10/24] [Talk] I will give a talk on Slapo for distributed model training at ByteDance on Jun 14. Thanks Youjie for inviting me!
- [05/16/24] [Award] I was selected as one of the ML and Systems Rising Stars! Thanks for all the support!
Publications
Allo: A Programming Model for Composable Accelerator Design
Hongzheng Chen*, Niansong Zhang*, Shaojie Xiang, Zhichen Zeng, Mengjia Dai, Zhiru Zhang
PLDI, 2024 | Blog (Zhihu)
Understanding the Potential of FPGA-Based Spatial Acceleration for Large Language Model Inference
Hongzheng Chen, Jiahao Zhang, Yixiao Du, Shaojie Xiang, Zichao Yue, Niansong Zhang, Yaohui Cai, Zhiru Zhang
ACM Transactions on Reconfigurable Technology and Systems (TRETS), 2024 (FCCM’24 Journal Track) | Blog (Zhihu)
Slapo: A Schedule Language for Progressive Optimization of Large Deep Learning Model Training
Hongzheng Chen, Cody Hao Yu, Shuai Zheng, Zhen Zhang, Zhiru Zhang, Yida Wang
ASPLOS, 2024 | Amazon Science
Formal Verification of Source-to-Source Transformations for HLS
Louis-Noël Pouchet, Emily Tucker, Niansong Zhang, Hongzheng Chen, Debjit Pal, Gabriel Rodríguez, Zhiru Zhang
FPGA, 2024 (Best Paper Award) | Cornell ECE News
BGL: GPU-Efficient GNN Training by Optimizing Graph Data I/O and Preprocessing
Tianfeng Liu*, Yangrui Chen*, Dan Li, Chuan Wu, Yibo Zhu, Jun He, Yanghua Peng, Hongzheng Chen, Hongzhi Chen, Chuanxiong Guo
NSDI, 2023
Accelerator Design with Decoupled Hardware Customizations: Benefits and Challenges
Debjit Pal, Yi-Hsiang Lai, Shaojie Xiang, Niansong Zhang, Hongzheng Chen, Jeremy Casas, Pasquale Cocchini, Zhenkun Yang, Jin Yang, Louis-Noël Pouchet, Zhiru Zhang
DAC, 2022 (Invited Paper)
HeteroFlow: An Accelerator Programming Model with Decoupled Data Placement for Software-Defined FPGAs
Shaojie Xiang, Yi-Hsiang Lai, Yuan Zhou, Hongzheng Chen, Niansong Zhang, Debjit Pal, Zhiru Zhang
FPGA, 2022
Krill: A Compiler and Runtime System for Concurrent Graph Processing
Hongzheng Chen, Minghua Shen, Nong Xiao, Yutong Lu
SC, 2021
FracBNN: Accurate and FPGA-Efficient Binary Neural Networks with Fractional Activations
Yichi Zhang, Junhao Pan, Xinheng Liu, Hongzheng Chen, Deming Chen, Zhiru Zhang
FPGA, 2021 (Best Paper Nominee)
Entropy-Directed Scheduling for FPGA High-Level Synthesis
Minghua Shen, Hongzheng Chen (Corresponding author), Nong Xiao
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 2020
A Deep-Reinforcement-Learning-Based Scheduler for FPGA HLS
Hongzheng Chen, Minghua Shen
ICCAD, 2019
Workshops / Preprints
Uncovering Magic with Magic: Schedule Reconstruction from High-Performance Kernel Libraries
Hongzheng Chen
PLDI Student Research Competition (SRC), 2024 (Bronze)
Structured Pruning is All You Need for Pruning CNNs at Initialization
Yaohui Cai, Weizhe Hua, Hongzheng Chen, G. Edward Suh, Christopher De Sa, Zhiru Zhang
arXiv:2203.02549, 2022
Teaching
- CS3110: Data Structures and Functional Programming
  Spring 2022, TA, Cornell University
- CS3410: Computer System Organization and Programming
  Fall 2021, TA, Cornell University
Professional Service
- Artifact Evaluation Committee: OOPSLA’25, OSDI’24, ATC’24, PLDI’24, OOPSLA’24, SOSP’23, OSDI’23, ATC’23, MLSys’23, PLDI’23, OSDI’22, ATC’22, PLDI’22, EuroSys’22, SOSP’21
- Journal Reviewer:
- Conference Reviewer: MLSys’25, ICLR’25, NeurIPS’24, MLSys’24, ICCAD’22
- Student Volunteer: Cornell CS PhD Application Committee’22-24, SIGPLAN-M, FPGA’24, FCCM’22
Awards & Honors
- PLDI’24 Student Research Competition (SRC) 3rd Place, SIGPLAN, 2024
- ML and Systems Rising Stars, MLCommons, 2024
- FPGA’24 Best Paper Award, FPGA, 2024
- FPGA’21 Best Paper Nominee, FPGA, 2021
- Outstanding Undergraduate Thesis Award, Sun Yat-sen University, 2021
- CCF Elite Collegiate Award (98 undergrads in China), China Computer Federation (CCF), 2020
- IEEE EDAthon 2nd Place, CEDA HK, 2019
Scholarship
- SenseTime Scholarship (21 undergrads in China), SenseTime, 2020
- Chinese National Scholarship $\times$ 2 (Top 1%), Ministry of Education of PRC, 2018-2020
- First-Prize Scholarship $\times$ 3 (Top 5%), Sun Yat-sen University, 2017-2020
- Samsung Scholarship (Top 1%), Samsung Electronics, 2017-2018
Travel Grants
- Graduate School Conference Grant, Cornell University, 2024
- SIGPLAN PLDI’24 Student Travel Grant, SIGPLAN, 2024
- IEEE FCCM’24 Student Travel Grant, FCCM, 2024
- SIGPLAN ASPLOS’24 Student Travel Grant, SIGPLAN, 2024
- Graduate School Conference Grant, Cornell University, 2023
- USENIX NSDI’23 Student Travel Grant, USENIX, 2023
Talks
- Allo: A Programming Model for Composable Accelerator Design
  - ACE Annual Review, Chicago, IL, Oct 1, 2024
  - PLDI'24, Copenhagen, Denmark, Jun 28, 2024
  - NVIDIA DL Compiler, Redmond, WA, May 29, 2024
  - FCCM'24 Demo Night, Orlando, FL, May 6, 2024
- Accelerating Large Language Model Inference on FPGA with Allo
  - UW SAMPL, Seattle, WA, May 31, 2024
  - UIUC AMD-Xilinx Center of Excellence (HACC), Online, Apr 10, 2024
- Understanding the Potential of FPGA-Based Spatial Acceleration for Large Language Model Inference
  - FCCM'24, Orlando, FL, May 7, 2024
- Slapo: A Schedule Language for Progressive Optimization of Large Deep Learning Model Training
  - ByteDance AML, Bellevue, WA, Jun 14, 2024
  - ASPLOS'24, San Diego, CA, May 1, 2024
  - ACE Liaison Meeting, Online, Mar 5, 2024
  - SRC TECHCON, Austin, TX, Sep 12, 2023
  - ADAPT Lab @ UIUC, Online, Jul 17, 2023
  - Spring ACE Center Meeting, Orlando, FL, Jun 22, 2023
  - CSL Retreat @ Cornell, Ithaca, NY, May 12, 2023
  - Pre-NSDI Systems Gathering @ BU, Boston, MA, Apr 16, 2023
  - Amazon AI, Online, Apr 10, 2023
  - TVMCon, Online, Mar 17, 2023
- An MLIR-Based Intermediate Representation for Accelerator Design with Decoupled Hardware Customizations
  - CRISP Liaison Meeting, Online, Sep 28, 2022
  - MLIR Open Design Meeting, Online, Aug 11, 2022
- Krill: A Compiler and Runtime System for Concurrent Graph Processing
  - SC'21, St. Louis, MO, Nov 17, 2021
- A Deep-Reinforcement-Learning-Based Scheduler for FPGA HLS
  - ICCAD'19, Denver, CO, Nov 5, 2019