Low-bit Quantization for Deep Graph Neural Networks with Smoothness-aware Message Propagation

Wang, S.; Eravci, B.; Guliyev, R.; Ferhatosmanoglu, H.

Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.11851/10873

Title:	Low-bit Quantization for Deep Graph Neural Networks with Smoothness-aware Message Propagation
Authors:	Wang, S.; Eravci, B.; Guliyev, R.; Ferhatosmanoglu, H.
Keywords:	graph neural networks; large-scale graph management; oversmoothing in GNNs; quantization; scalable machine learning; backpropagation; deep neural networks; message passing; multilayer neural networks; quadratic programming; large-scales; message propagation; model size; number of layers; quantizers; iterative methods
Publisher:	Association for Computing Machinery
Abstract:	Graph Neural Network (GNN) training and inference face significant scalability challenges with respect to both model size and number of layers, which degrade efficiency and accuracy for large and deep GNNs. We present an end-to-end solution that addresses these challenges for efficient GNNs in resource-constrained environments while avoiding the oversmoothing problem in deep GNNs. We introduce a quantization-based approach for all stages of GNNs, from message passing in training to node classification, compressing the model and enabling efficient processing. The proposed GNN quantizer learns quantization ranges and reduces the model size with comparable accuracy even under low-bit quantization. To scale with the number of layers, we devise a message propagation mechanism in training that controls layer-wise changes of similarities between neighboring nodes. This objective is incorporated into a Lagrangian function with constraints, and a differential multiplier method is used to iteratively find optimal embeddings. This mitigates oversmoothing and bounds the quantization error. Significant improvements are demonstrated over state-of-the-art quantization methods and deep GNN approaches in both full-precision and quantized models. The proposed quantizer demonstrates superior performance in INT2 configurations across all stages of GNN, achieving a notable level of accuracy. In contrast, existing quantization approaches fail to generate satisfactory accuracy levels. Finally, inference with INT2 and INT4 representations exhibits speedups of 5.11× and 4.70×, respectively, compared to full-precision counterparts. © 2023 Copyright held by the owner/author(s). Publication rights licensed to ACM.
Description:	ACM SIGIR; ACM SIGWEB; 32nd ACM International Conference on Information and Knowledge Management, CIKM 2023; 21 October 2023 through 25 October 2023; 193792
URI:	https://doi.org/10.1145/3583780.3614955 https://hdl.handle.net/20.500.11851/10873
ISBN:	9798400701245
Appears in Collections:	Scopus İndeksli Yayınlar Koleksiyonu / Scopus Indexed Publications Collection; WoS İndeksli Yayınlar Koleksiyonu / WoS Indexed Publications Collection
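
The low-bit quantization mentioned in the abstract can be illustrated with a generic uniform quantizer. The sketch below is not the authors' method: the function name `quantize_uniform`, the fixed range [-1, 1], and the random stand-in "embeddings" are all assumptions made for illustration, whereas the paper's quantizer learns its ranges during training. It only shows how INT2 and INT4 codes arise and why rounding error for in-range values is at most half a quantization step.

```python
import numpy as np


def quantize_uniform(x, num_bits, x_min, x_max):
    """Map x onto 2**num_bits evenly spaced levels in [x_min, x_max].

    Hypothetical helper for illustration; the range is fixed here,
    not learned as in the paper.
    """
    levels = 2 ** num_bits - 1              # highest integer code
    scale = (x_max - x_min) / levels        # width of one quantization step
    clipped = np.clip(x, x_min, x_max)      # values outside the range saturate
    codes = np.round((clipped - x_min) / scale).astype(np.int64)
    dequantized = codes * scale + x_min     # back to floating point
    return codes, dequantized, scale


# Stand-in node embeddings (assumed data, not from the paper).
rng = np.random.default_rng(0)
embeddings = rng.uniform(-1.0, 1.0, size=(5, 4))

results = {}
for bits in (2, 4):
    codes, approx, scale = quantize_uniform(embeddings, bits, -1.0, 1.0)
    max_err = np.abs(embeddings - approx).max()
    results[bits] = (codes, approx, scale, max_err)
    print(f"INT{bits}: {2 ** bits} levels, step {scale:.4f}, max error {max_err:.4f}")
```

With 2 bits there are only four representable values, so the step is large and the choice of range matters a great deal, which is one plausible reason a learned range helps at such low precision.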
