Tail-STEAK: Improve Friend Recommendation for Tail Users via Self-Training Enhanced Knowledge Distillation

Published in Proceedings of the 38th Annual AAAI Conference on Artificial Intelligence (AAAI), 2024

Yijun Ma, Chaozhuo Li, Xiao Zhou*.

Graph neural networks (GNNs) are commonly employed in collaborative friend recommendation systems. Nevertheless, recent studies reveal a notable performance gap for users with few connections, commonly known as tail users, relative to their counterparts with abundant connections (head users). Treating head and tail users uniformly poses two challenges for tail user preference learning: (C1) Label Sparsity, as tail users typically possess few labels; and (C2) Neighborhood Sparsity, where tail users exhibit sparse observable friendships, leading to preference distributions distinct from those of head users and to degraded performance. In response to these challenges, we introduce Tail-STEAK, a novel framework that combines self-training with enhanced knowledge distillation for tail user representation learning. To address (C1), we present Tail-STEAK_base, a two-stage self-training framework: in the first stage, only head users and their accurate connections are used for training, while pseudo links are generated for tail users in the second stage. To tackle (C2), we propose two data augmentation-based self-knowledge distillation pretext tasks, which are seamlessly integrated into the corresponding stages of Tail-STEAK_base, culminating in the comprehensive Tail-STEAK framework. Extensive experiments on state-of-the-art GNN-based friend recommendation models substantiate the efficacy of Tail-STEAK in significantly improving tail user performance. Our code and data are publicly available at https://github.com/antman9914/Tail-STEAK.
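To make the two-stage self-training scheme concrete, below is a minimal, self-contained PyTorch sketch of the idea behind Tail-STEAK_base. It is an illustration under simplifying assumptions, not the authors' implementation: a free embedding table stands in for the GNN encoder, `bpr_loss` is a generic pairwise ranking loss, and the confidence threshold `tau`, the synthetic edge data, and the tail-user split are all hypothetical.

```python
import torch
import torch.nn.functional as F

# Hypothetical sketch of two-stage self-training (not the paper's code).
# Stage 1 trains only on head users' observed links; stage 2 adds
# confidently scored pseudo links for tail users and retrains.

def bpr_loss(emb, pos_edges, num_nodes):
    """Pairwise ranking loss: observed links should outscore random pairs."""
    u, v = pos_edges
    neg = torch.randint(num_nodes, (u.numel(),))
    pos_score = (emb[u] * emb[v]).sum(-1)
    neg_score = (emb[u] * emb[neg]).sum(-1)
    return -F.logsigmoid(pos_score - neg_score).mean()

def train(emb, edges, num_nodes, epochs=100, lr=0.05):
    opt = torch.optim.Adam([emb], lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = bpr_loss(emb, edges, num_nodes)
        loss.backward()
        opt.step()

num_nodes, dim, tau = 100, 16, 0.8
emb = torch.nn.Parameter(torch.randn(num_nodes, dim) * 0.1)
head_edges = torch.randint(num_nodes, (2, 500))   # observed head-user links
tail_users = torch.arange(80, 100)                # sparsely connected users

# Stage 1: fit representations on head users' reliable links only.
train(emb, head_edges, num_nodes)

# Stage 2: pseudo-label confident tail-user links, retrain on the union.
with torch.no_grad():
    scores = torch.sigmoid(emb[tail_users] @ emb.T)        # tail-vs-all scores
    scores[torch.arange(len(tail_users)), tail_users] = 0  # drop self-matches
    t_idx, cand = (scores > tau).nonzero(as_tuple=True)
    pseudo_edges = torch.stack([tail_users[t_idx], cand])
train(emb, torch.cat([head_edges, pseudo_edges], dim=1), num_nodes)
```

The full framework replaces the embedding table with a GNN-based friend recommendation model and augments the two stages with the self-knowledge distillation pretext tasks described above.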