I'm a Ph.D. student at Shanghai Jiao Tong University, majoring in Computer Science. I am a member of APEX Lab, advised by Prof. Weinan Zhang and Prof. Yong Yu. I'm working on Apache TVM and MLC-LLM, enabling the deployment of LLMs on diverse hardware backends, in close collaboration with Tianqi Chen.

My research interests include machine learning compilers and machine learning systems.

News

2025.04: I will be joining SII as an assistant professor after graduation (likely in Summer 2025). I'm looking for self-motivated Ph.D. students who enjoy coding and are interested in (distributed) machine learning systems and AI infrastructure. Feel free to contact me via email if you are interested.

Education

Shanghai Jiao Tong University

Ph.D. in Computer Science 2020 - 2025

Ph.D. candidate in Computer Science at the School of Electronic, Information and Electrical Engineering

Research Areas: Machine Learning Systems, Machine Learning Compilers

Shanghai Jiao Tong University

B.Sc. in Computer Science 2016 - 2020

I was a member of the ACM Honors Class, Zhiyuan College. Zhiyuan College is a college for training outstanding students in the basic sciences, while the ACM Honors Class is an elite CS program for the top 5% of students.

Publications

TensorIR: An Abstraction for Automatic Tensorized Program Optimization

ASPLOS 2023 [Paper]

Siyuan Feng*, Bohan Hou*, Hongyi Jin, Wuwei Lin, Junru Shao, Ruihang Lai, Zihao Ye, Lianmin Zheng, Cody Hao Yu, Yong Yu, Tianqi Chen

Relax: Composable Abstractions for End-to-End Dynamic Machine Learning

ASPLOS 2025 [Paper]

Ruihang Lai*, Junru Shao*, Siyuan Feng*, Steven S. Lyubomirsky*, Bohan Hou, Wuwei Lin, Zihao Ye, Hongyi Jin, Yuchen Jin, Jiawei Liu, Lesheng Jin, Yaxing Cai, Ziheng Jiang, Yong Wu, Sunghyun Park, Prakalp Srivastava, Jared Roesch, Todd C. Mowry, Tianqi Chen

Productively Deploying Emerging Models on Emerging Platforms: A Top-Down Approach for Testing and Debugging

ISSTA 2025 [Paper]

Siyuan Feng*, Jiawei Liu*, Ruihang Lai, Charlie F. Ruan, Yong Yu, Lingming Zhang, Tianqi Chen

Effectively Scheduling Computational Graphs of Deep Neural Networks toward Their Domain-Specific Accelerators

OSDI 2023 [Paper]

Jie Zhao, Siyuan Feng, Xiaoqiang Dan, Fei Liu, Chengke Wang, Sheng Yuan, Wenyuan Lv, Qikai Xie

Tensor Program Optimization with Probabilistic Programs

NeurIPS 2022 [Paper]

Junru Shao, Xiyou Zhou, Siyuan Feng, Bohan Hou, Ruihang Lai, Hongyi Jin, Wuwei Lin, Masahiro Masuda, Cody Hao Yu, and Tianqi Chen.

CityFlow: A Multi-Agent Reinforcement Learning Environment for Large Scale City Traffic Scenario

WWW 2019 [Paper]

Huichu Zhang, Siyuan Feng, Chang Liu, Yaoyao Ding, Yichen Zhu, Zihan Zhou, Weinan Zhang, Yong Yu, Haiming Jin, Zhenhui Li

Apache TVM

GitHub Repo

  • Open-source machine learning compiler, enabling deployment of models on diverse hardware backends
  • Leading the TensorIR project, the next-generation tensor-level IR for tensorized hardware
  • Co-leading the TVM Unity/Relax project, the next-generation graph-level IR for dynamic models
  • Contributing to several key features: TVMScript, MetaSchedule, runtime, and frontend
  • Serving on the Apache TVM Project Management Committee (PMC)

MLC-LLM

GitHub Repo

  • Compiles LLMs and deploys models natively on every device
  • Supported hardware: NVIDIA GPUs, AMD GPUs, Apple GPUs, mobile GPUs, and Intel iGPUs
  • Supported runtimes: CUDA, ROCm, Metal, Vulkan, OpenCL, WebGPU
  • Supports distributed inference for large models on CUDA and ROCm
  • Supports LLM serving with OpenAI API compatibility

Web-LLM

GitHub Repo

  • Brings large language models and chat to web browsers with local GPU acceleration
  • Leverages the emerging WebGPU API to run LLMs inside the browser
  • Serves as a backend runtime of MLC-LLM