Hong Huang /ˈhɒŋ ˈhwaŋ/ 黄弘

I am currently a Ph.D. candidate at City University of Hong Kong (CityU), supervised by Prof. Dapeng Oliver Wu. I obtained my M.S. degree from the University of Florida (UF), advised by Prof. Dapeng Oliver Wu and Prof. Ruogu Fang, and my B.E. degree from Shanghai Jiao Tong University (SJTU).

Research Interest

My research focuses on Efficient AI, with the overarching goal of AI democratization, making powerful AI accessible to everyone.

Efficient AI:
- Efficient LLM (Bar Menu): [Quaff ACL’25], [Tequila ICLR’26], [Sherry Preprint]
- Efficient Federated Learning (Fed- Series): [FedRTS NeurIPS’25], [FedMef CVPR’24], [FedTiny ICDCS’23]
- Efficient ML System (.cpp Series): [Prima.cpp ICLR’26]

I lead the FedPruning Research Group, which focuses on cutting-edge research in edge computing and model compression. The group is dedicated to guiding beginners in launching their research careers and currently comprises 15+ junior Ph.D. and M.S. students. We are looking for self-motivated students to join us (minimum requirement: familiarity with deep learning and PyTorch).

News

2026-01: Our two works, Tequila and Prima.cpp, were accepted by ICLR 2026.
2026-01: Our work, Sherry, was released on arXiv.
2025-11: I was selected as a DAAD AINeT fellow for the Postdoc-NeT-AI 11/2025.
2025-10: I received the NeurIPS 2025 Travel Award.
2025-09: Our work, FedRTS, was accepted by NeurIPS 2025.
2025-08: I received the Research Tuition Scholarship from CityU.
2025-05: Our work, Quaff, was accepted by ACL 2025.

Selected Publications

[Preprint] Hong Huang, Decheng Wu, Qiangqiang Hu, Guanghua Yu, Jinhai Yang, Jianchen Zhu, Xue Liu, and Dapeng Wu. “Sherry: Hardware-Efficient 1.25-Bit Ternary Quantization via Fine-grained Sparsification.” arXiv preprint arXiv:2601.07892 (2026).
[ICLR ’26] Hong Huang, Decheng Wu, Rui Cen, Guanghua Yu, Zonghang Li, Kai Liu, Jianchen Zhu, Peng Chen, Xue Liu, and Dapeng Wu. “Tequila: Trapping-free Ternary Quantization for Large Language Models.”
[ICLR ’26] Zonghang Li, Tao Li, Wenjiao Feng, Rongxing Xiao, Jianshu She, Hong Huang, Mohsen Guizani, Hongfang Yu, Qirong Ho, Wei Xiang, and Steve Liu. “Prima.cpp: Fast 30-70B LLM Inference on Heterogeneous and Low-Resource Home Clusters.”
[ACL ’25] Hong Huang and Dapeng Wu. “Quaff: Quantized Parameter-Efficient Fine-Tuning under Outlier Spatial Stability Hypothesis.”
[NeurIPS ’25] Hong Huang, Hai Yang, Yuan Chen, Jiaxun Ye, and Dapeng Wu. “FedRTS: Federated Robust Pruning via Combinatorial Thompson Sampling.”
[CVPR ’24] Hong Huang, Weiming Zhuang, Chen Chen, and Lingjuan Lyu. “FedMef: Towards Memory-efficient Federated Dynamic Pruning.”
[ICDCS ’23] Hong Huang, Lan Zhang, Chaoyue Sun, Ruogu Fang, Xiaoyong Yuan, and Dapeng Wu. “Distributed Pruning Towards Tiny Neural Networks in Federated Learning.”