GraphSHA: Synthesizing Harder Samples for Class-Imbalanced Node Classification
1 CSE, SYSU | 2 AI Thrust, HKUST (GZ) | 3 CSE, HKUST
Abstract
Class imbalance, where some classes have far fewer instances than others,
is ubiquitous in real-world graph-structured scenarios.
Recent studies find that off-the-shelf Graph Neural Networks (GNNs) under-represent minor-class samples.
We investigate this phenomenon and find that its main cause is that the subspaces of minor classes are squeezed
by those of the major classes in the latent space.
This motivates us to enlarge the decision boundaries of minor classes, and we propose a general framework,
GraphSHA, which does so by Synthesizing HArder minor samples.
Furthermore, to keep the enlarged minor boundaries from intruding on the subspaces of neighbor classes,
we propose a module called SemiMixup, which transmits the enlarged boundary information to the interior of the minor
classes while blocking information propagation from minor classes to neighbor classes.
Empirically, GraphSHA is effective at enlarging the decision boundaries of minor classes:
it outperforms various baseline methods in class-imbalanced node classification with different GNN backbone
encoders on seven public benchmark datasets.
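To make the idea in the abstract concrete, the following is a minimal sketch of one possible reading of it, not the authors' implementation. It assumes that a harder minor sample is formed by interpolating the feature of a minor-class anchor node with that of an auxiliary node from another class, and that the new node receives one-way edges only toward the anchor's neighbors, so that boundary information flows into the minor class and nothing flows toward the neighbor class. The function name `synthesize_harder_sample`, its parameters, and the mixing weight `lam` are all hypothetical.

```python
import numpy as np

def synthesize_harder_sample(x, anchor, aux, neighbors_of_anchor, lam):
    """Hypothetical sketch: build one synthetic minor-class node.

    x                    -- (num_nodes, num_features) node feature matrix
    anchor               -- index of a minor-class node
    aux                  -- index of an auxiliary node from a different class
    neighbors_of_anchor  -- indices of the anchor's neighbors
    lam                  -- mixing weight in [0, 1] (assumed, not from the source)
    """
    # Interpolate features; a smaller lam pulls the sample toward the other
    # class, which is what would make it "harder" under this assumption.
    x_new = lam * x[anchor] + (1.0 - lam) * x[aux]

    # The synthetic node is appended after the existing nodes.
    new_id = x.shape[0]

    # Directed (source, target) edges: the synthetic node sends messages to
    # the anchor's neighbors only. No edge touches the auxiliary node's side.
    edges = [(new_id, int(n)) for n in neighbors_of_anchor]
    return x_new, edges


# Toy usage: node 0 is the minor anchor, node 1 the auxiliary node,
# node 2 a neighbor of the anchor.
x = np.array([[0.0, 0.0], [2.0, 2.0], [0.0, 1.0]])
x_new, edges = synthesize_harder_sample(x, anchor=0, aux=1,
                                        neighbors_of_anchor=[2], lam=0.5)
```

In this sketch the synthetic node would keep the minor-class label; how the anchor and auxiliary nodes are chosen and how the mixing weight is sampled are left open here because the abstract does not specify them.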
KDD 2023 Presentation
Highlights
(1) Squeezed minority problem
(2) Node classification in the long-tailed setting
(3) Node classification in the step setting
(4) Node classification on a large-scale, naturally imbalanced dataset
(5) Ablation study
(6) Influence of imbalance ratio
(7) How GraphSHA tackles the squeezed minority problem
Citation
@inproceedings{li2023graphsha,
title={{GraphSHA}: Synthesizing Harder Samples for Class-Imbalanced Node Classification},
author={Li, Wen-Zhi and Wang, Chang-Dong and Xiong, Hui and Lai, Jian-Huang},
booktitle={Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining},
pages={1328--1340},
year={2023}
}