Zhao Guangxiang (Peking University)

Guangxiang Zhao (赵光香)

Researcher at Shanghai AI Lab since August 2022.

Before joining, I received a Doctor of Natural Science degree from Peking University,
where I did research in the Language Computing and Machine Learning Group (LANCO),
advised by Prof. Xu SUN.

GitHub
Twitter
Google Scholar
OpenReview

Email: zhaoguangxiang at pku.edu.cn / guangxiangzhao at gmail.com

Research Interests:

Machine learning methods for natural language processing.

Academic Activities:

Conference Reviewer: ICML-2023, ACL-2023, NeurIPS-2022, ICLR-2022, NAACL-2022, ACL-2022, ACL-2021
Journal Reviewer: TNNLS
Secondary Reviewer: EMNLP-2020, ACL-2019

Awards:

Excellent Graduate of Peking University, 2022 (Top 13%)
Merit Student of Peking University, 2019

Papers:

Well-classified Examples are Underestimated in Classification with Deep Neural Networks
Guangxiang Zhao, Wenkai Yang, Xuancheng Ren, Lei Li, Yunfang Wu, Xu Sun.
AAAI 2022
TL;DR: In this paper, we find that the cross-entropy loss hinders representation learning, energy optimization, and margin growth, while well-classified examples play a vital role in solving these issues. We support this finding with both theoretical analysis and empirical results.
[pdf] [code] [poster]
Model Uncertainty-Aware Knowledge Amalgamation for Pre-Trained Language Models
Lei Li, Yankai Lin, Xuancheng Ren, Guangxiang Zhao, Peng Li, Jie Zhou, Xu Sun.
Findings of EMNLP 2022
TL;DR: In this paper, we explore a novel model reuse paradigm, Knowledge Amalgamation, to merge the knowledge from different teacher-PLMs.
[pdf]
Topology-Imbalance Learning for Semi-Supervised Node Classification
Deli Chen, Yankai Lin, Guangxiang Zhao, Xuancheng Ren, Peng Li, Jie Zhou, Xu Sun.
NeurIPS 2021
TL;DR: We identify the problem of Topology-Imbalance and propose the ReNode method as the initial solution.
[pdf] [code]
Learning Relation Alignment for Calibrated Cross-modal Retrieval
Shuhuai Ren, Junyang Lin, Guangxiang Zhao, Rui Men, An Yang, Jingren Zhou, Xu Sun, Hongxia Yang.
ACL 2021
TL;DR: We propose the idea of relation alignment, which aligns self-attention between the two modalities.
[pdf] [code]
Layer-Wise Multi-View Decoding for Improved Natural Language Generation
Fenglin Liu*, Xuancheng Ren* (Equal Contribution), Guangxiang Zhao, Xu Sun
Preprint 2020
TL;DR: We find a limitation in the information flow of the Transformer and propose an effective cross-view decoding method to address it.
[pdf]
Understanding and Improving Layer Normalization
Jingjing Xu, Xu Sun, Zhiyuan Zhang, Guangxiang Zhao, Junyang Lin.
NeurIPS 2019
TL;DR: We find that the back-propagation of LayerNorm is essential. We also find that the bias and the gain in LayerNorm increase the risk of over-fitting and do not work in most cases.
[pdf] [code]
Parallel Intersected Multi-scale Attention for Sequence to Sequence Learning
Guangxiang Zhao, Xu Sun, Jingjing Xu, Zhiyuan Zhang, Liangcheng Luo.
Preprint 2019
TL;DR: We propose a simple module, Prime, that consistently outperforms the more complicated Transformer model on the main NMT datasets, reaching SOTA performance by simply stacking this module. We also find that when combining convolution and self-attention, their operations for learning interactions between tokens should be performed on the same features.
[pdf] [code, scripts, and pretrained models] [unified transformer]
Explicit Sparse Transformer
Guangxiang Zhao, Junyang Lin, Zhiyuan Zhang, Xuancheng Ren, Qi Su, Xu Sun.
Under revision at Neurocomputing
TL;DR: We propose a sparse attention method that needs neither a local dependency constraint nor predefined sparse attention patterns. We demonstrate that sparse attention (8, or about 1/4 of the sequence length (30) in NMT) is better than regular attention. Our method enables extremely sparse attention; e.g., we further improve the sparsity of the state-of-the-art sparse attention of Adaptive Attention Span by $40\times$.
[pdf] [code] [extremely sparse transformer]
Review-Driven Multi-Label Music Style Classification by Exploiting Style Correlations
Guangxiang Zhao*, Jingjing Xu* (Equal Contribution), Qi Zeng, Xuancheng Ren, Xu Sun.
NAACL 2019
TL;DR: We build a multi-label text classification dataset (music styles are hidden in the text) with strong label correlations, and propose a method that automatically learns and exploits label correlations during training.
[pdf] [data]