Query Execution Plan Optimization Model Based on Graph Query Optimization

  • Qing-Quan Fan (Dept. of Computer and Media Engineering, Tongmyong University) ;
  • Kun-Hee Han (Division of Information & Communication Engineering, Baekseok University) ;
  • Seung-Soo Shin (Dept. of Information Security, Tongmyong University)
  • Received : 2025.05.15
  • Accepted : 2025.07.20
  • Published : 2025.07.30

Abstract

Traditional SQL query optimizers rely heavily on rule-based or cost-estimation models, which often produce suboptimal execution plans for complex multi-join or nested queries. To address these limitations, we propose GQO, a graph neural network-based query optimizer that transforms execution plans into graph structures and leverages both GCN and GAT layers to model operator dependencies and semantic relationships. GQO generates query embeddings for accurate execution time prediction and produces optimization hints that can be injected into PostgreSQL for real-world performance gains. Experimental results on TPC-H queries show that GQO significantly outperforms traditional cost models, MLP baselines, and Tree-LSTM models, achieving an average performance improvement of over 26%.
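The core idea of treating an execution plan as a graph can be illustrated with a small sketch. This is not the authors' implementation: the plan structure, the node features (operator type, estimated rows, estimated cost), and the single NumPy-based GCN layer below are all simplified assumptions chosen to mirror the pipeline the abstract describes (plan tree → node features and edges → graph convolution → pooled query embedding).

```python
import numpy as np

# Hypothetical stand-in for a PostgreSQL plan tree, shaped like the
# nesting EXPLAIN (FORMAT JSON) produces. Node features are assumed:
# [operator id, log(estimated rows), log(estimated cost)].
plan = {
    "op": "Hash Join", "rows": 1000.0, "cost": 250.0,
    "children": [
        {"op": "Seq Scan", "rows": 5000.0, "cost": 100.0, "children": []},
        {"op": "Index Scan", "rows": 200.0, "cost": 40.0, "children": []},
    ],
}

OP_IDS = {"Seq Scan": 0, "Index Scan": 1, "Hash Join": 2}

def plan_to_graph(node, nodes=None, edges=None, parent=None):
    """Flatten the plan tree into node features and parent->child edges."""
    if nodes is None:
        nodes, edges = [], []
    idx = len(nodes)
    nodes.append([OP_IDS[node["op"]],
                  np.log1p(node["rows"]),
                  np.log1p(node["cost"])])
    if parent is not None:
        edges.append((parent, idx))
    for child in node["children"]:
        plan_to_graph(child, nodes, edges, idx)
    return np.array(nodes, dtype=float), edges

def gcn_layer(X, edges, W):
    """One GCN step: D^{-1/2} A D^{-1/2} X W with self-loops, then ReLU."""
    n = X.shape[0]
    A = np.eye(n)                        # self-loops
    for u, v in edges:
        A[u, v] = A[v, u] = 1.0
    d = A.sum(axis=1)
    A_hat = A / np.sqrt(np.outer(d, d))  # symmetric normalization
    return np.maximum(A_hat @ X @ W, 0.0)

X, edges = plan_to_graph(plan)
rng = np.random.default_rng(0)
W = rng.standard_normal((3, 8))          # random weights, untrained
H = gcn_layer(X, edges, W)               # per-operator representations
emb = H.mean(axis=0)                     # graph-level query embedding
print(H.shape, emb.shape)                # (3, 8) (8,)
```

In the full model, the pooled embedding `emb` would feed a regression head that predicts execution time; a GAT layer would replace the uniform normalization here with learned attention weights over neighboring operators.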
