Hong Huang /ˈhɒŋ ˈhwaŋ/ 黄弘

I am currently a Ph.D. candidate at City University of Hong Kong (CityU), supervised by Prof. Dapeng Oliver Wu. I obtained my M.S. degree from the University of Florida (UF), advised by Prof. Dapeng Oliver Wu and Prof. Ruogu Fang. I obtained my B.E. degree from Shanghai Jiao Tong University (SJTU).

Research Interest

My research focuses on Efficient AI, with the overarching goal of AI democratization, making powerful AI accessible to everyone.

I lead the FedPruning Research Group, which focuses on cutting-edge research in edge computing and model compression. The group is dedicated to guiding beginners in launching their research careers and currently comprises 15+ junior Ph.D. and M.S. students. We are looking for self-motivated students to join us (minimum requirements: familiarity with deep learning and PyTorch).

News

  • 2026-01: Our two works, Tequila and Prima.cpp, were accepted by ICLR 2026.
  • 2026-01: Our work, Sherry, was released on arXiv.
  • 2025-11: I was selected as a DAAD AINeT fellow for the Postdoc-NeT-AI 11/2025.
  • 2025-10: I received the NeurIPS 2025 Travel Award.
  • 2025-09: Our work, FedRTS, was accepted by NeurIPS 2025.
  • 2025-08: I received the Research Tuition Scholarship from CityU.
  • 2025-05: Our work, Quaff, was accepted by ACL 2025.

Selected Publications

  • [Preprint] Hong Huang, Decheng Wu, Qiangqiang Hu, Guanghua Yu, Jinhai Yang, Jianchen Zhu, Xue Liu, and Dapeng Wu. “Sherry: Hardware-Efficient 1.25-Bit Ternary Quantization via Fine-grained Sparsification.” arXiv preprint arXiv:2601.07892 (2026).
  • [ICLR ‘26] Hong Huang, Decheng Wu, Rui Cen, Guanghua Yu, Zonghang Li, Kai Liu, Jianchen Zhu, Peng Chen, Xue Liu, and Dapeng Wu. “Tequila: Trapping-free Ternary Quantization for Large Language Models.”
  • [ICLR ‘26] Zonghang Li, Tao Li, Wenjiao Feng, Rongxing Xiao, Jianshu She, Hong Huang, Mohsen Guizani, Hongfang Yu, Qirong Ho, Wei Xiang, and Steve Liu. “Prima.cpp: Fast 30-70B LLM Inference on Heterogeneous and Low-Resource Home Clusters.”
  • [ACL ‘25] Hong Huang and Dapeng Wu. “Quaff: Quantized Parameter-Efficient Fine-Tuning under Outlier Spatial Stability Hypothesis.”
  • [NeurIPS ‘25] Hong Huang, Hai Yang, Yuan Chen, Jiaxun Ye, Dapeng Wu. “FedRTS: Federated Robust Pruning via Combinatorial Thompson Sampling.”
  • [CVPR ‘24] Hong Huang, Weiming Zhuang, Chen Chen, and Lingjuan Lyu. “FedMef: Towards Memory-efficient Federated Dynamic Pruning.”
  • [ICDCS ‘23] Hong Huang, Lan Zhang, Chaoyue Sun, Ruogu Fang, Xiaoyong Yuan, and Dapeng Wu. “Distributed Pruning Towards Tiny Neural Networks in Federated Learning.”