Research on Anomaly Detection in Microservice Based on Graph Neural Networks
Abstract
With the development of the information and innovation industry, microservice architecture has become mainstream in software development because it offers faster delivery, better scalability, and greater independence between services. However, microservice architecture also poses operational challenges. The large number of services, their complex dependencies, and their varying resource demands often make traditional log inspection and threshold configuration inadequate for anomaly detection, so operations personnel struggle to locate the root cause of a failure quickly. To address the difficulty of pinpointing fault nodes in microservice systems, this paper proposes GraphSAGE with Attention, an algorithm based on graph neural networks. The algorithm combines the neighbor node sampling method of GraphSAGE with an improved attention mechanism to locate anomalies in real time from abnormal operational data. Unlike traditional methods, GraphSAGE with Attention requires no manual intervention: it locates anomalies by periodically collecting operational data that reflect fault characteristics. It also achieves better fault localization with lower resource usage while meeting real-time requirements. In this paper, we construct a dataset of fault call chains by injecting faults and simulating user calls on a Kubernetes microservice platform, and we test the accuracy of GraphSAGE with Attention in various anomaly scenarios. Experimental results show that the algorithm achieves the best performance across multiple metrics. The algorithm is therefore significant for root-cause localization of faults and has practical application value.
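The core idea named in the abstract, sampling a fixed number of neighbors as in GraphSAGE and weighting them with attention before aggregation, can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not specify the "improved attention mechanism", so plain dot-product softmax attention is used here as a stand-in, and no learned weight matrices are included. The service names, the two-element metric vectors, and all function names are hypothetical and chosen only for illustration.

```python
import math
import random

# Hypothetical call graph: each service maps to the services it calls.
call_graph = {
    "gateway": ["orders", "users", "payments"],
    "orders": ["payments"],
    "users": [],
    "payments": [],
}

# Hypothetical per-service features, e.g. [normalized latency, error rate].
metrics = {
    "gateway": [0.4, 0.1],
    "orders": [0.9, 0.7],
    "users": [0.2, 0.0],
    "payments": [0.8, 0.6],
}


def sample_neighbors(graph, node, k, rng):
    """Draw exactly k neighbors, GraphSAGE-style.

    Samples without replacement when enough neighbors exist,
    with replacement otherwise; returns [] for an isolated node.
    """
    neighbors = graph.get(node, [])
    if not neighbors:
        return []
    if len(neighbors) >= k:
        return rng.sample(neighbors, k)
    return [rng.choice(neighbors) for _ in range(k)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_aggregate(features, node, sampled):
    """Weighted sum of sampled neighbor features.

    Assumption: attention scores are dot products between the node's
    features and each neighbor's features (a stand-in for the paper's
    unspecified attention mechanism).
    """
    h = features[node]
    if not sampled:
        return [0.0] * len(h), []
    scores = [sum(a * b for a, b in zip(h, features[n])) for n in sampled]
    weights = softmax(scores)
    aggregated = [
        sum(w * features[n][i] for w, n in zip(weights, sampled))
        for i in range(len(h))
    ]
    return aggregated, weights


def embed(graph, features, node, k=2, seed=0):
    """Concatenate a node's own features with its attention-weighted neighborhood."""
    rng = random.Random(seed)
    sampled = sample_neighbors(graph, node, k, rng)
    aggregated, weights = attention_aggregate(features, node, sampled)
    return features[node] + aggregated, sampled, weights


embedding, sampled, weights = embed(call_graph, metrics, "gateway", k=2)
print(sampled, weights, embedding)
```

In a full model, the concatenated vector would pass through a learned linear layer and nonlinearity, and a classifier over the resulting embeddings would score each service as a candidate fault node. Those steps are omitted here because the abstract gives no details about them.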