Performance Analysis and Identifying Characteristics of Processing-in-Memory System with Polyhedral Benchmark Suite

Jeonggeun Kim

Journal of the Semiconductor & Display Technology (반도체디스플레이기술학회지)

Volume 22, Issue 3, pp. 142-148, 2023, pISSN 1738-2270

The Korean Society Of Semiconductor & Display Technology (한국반도체디스플레이기술학회)

프로세싱 인 메모리 시스템에서의 PolyBench 구동에 대한 동작 성능 및 특성 분석과 고찰

Jeonggeun Kim (School of Computer Science and Engineering, College of IT Engineering, Kyungpook National University)

김정근 (경북대학교 IT대학 컴퓨터학부)

Received : 2023.09.10
Accepted : 2023.09.18
Published : 2023.09.30

Abstract

In this paper, we identify performance issues that arise when compute kernels from PolyBench are executed on Processing-in-Memory (PIM) devices. PolyBench collects compute kernels that form the core computational units of various data-intensive workloads, such as deep learning. Using our in-house simulator, we measured various performance metrics of these workloads and compared systems based on traditional out-of-order and in-order processors with PIM-based systems. The results show that the PIM-based system outperforms the other computing models because the PolyBench kernels exhibit short-term data reuse. However, some kernels perform poorly on PIM-based systems that lack a multi-layer cache hierarchy, because these kernels exhibit long-term data reuse. Our evaluation and analysis therefore suggest that further research should consider dynamic, workload-pattern-adaptive approaches to overcome the performance degradation caused by kernels with long-term data reuse and hidden data locality.
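
The distinction between short-term and long-term data reuse can be made concrete with reuse distance, that is, the number of distinct addresses touched between two consecutive accesses to the same address. The following minimal sketch is not taken from the paper or its simulator. The toy matrix-vector kernel and the names `reuse_distances` and `matvec_trace` are assumptions introduced only for illustration.

```python
from collections import OrderedDict


def reuse_distances(trace):
    """Return the reuse distance of every access (None for a first touch)."""
    stack = OrderedDict()  # keys ordered from least to most recently used
    distances = []
    for addr in trace:
        if addr in stack:
            keys = list(stack)
            # Keys after addr are the distinct addresses touched since its last use.
            distances.append(len(keys) - 1 - keys.index(addr))
            stack.move_to_end(addr)
        else:
            distances.append(None)
            stack[addr] = True
    return distances


def matvec_trace(n):
    """Hypothetical access trace of y[i] += A[i][j] * x[j] for an n-by-n matrix."""
    for i in range(n):
        for j in range(n):
            yield ("A", i, j)  # streamed, never reused
            yield ("x", j)     # reused once per row: long-term reuse
            yield ("y", i)     # reused every iteration: short-term reuse


def max_distance_per_array(n):
    """Largest observed reuse distance for each array name in the toy kernel."""
    trace = list(matvec_trace(n))
    result = {}
    for addr, dist in zip(trace, reuse_distances(trace)):
        if dist is not None:
            result[addr[0]] = max(result.get(addr[0], 0), dist)
    return result


if __name__ == "__main__":
    print(max_distance_per_array(64))
```

In this toy trace the accumulator `y[i]` is reused after only two other addresses, whereas each `x[j]` is reused only after roughly `2n` distinct addresses. A small buffer captures the first pattern, while the second needs capacity that grows with the problem size, which is consistent with the abstract's observation that kernels with long-term reuse suffer when no multi-layer cache hierarchy is available.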

Acknowledgement

This research was supported by a National Research Foundation of Korea (NRF) grant funded by the Korean Government (Ministry of Education) (NRF-2021R1I1A1A01059737) and by the MSIT (Ministry of Science and ICT), Korea, under the Innovative Human Resource Development for Local Intellectualization support program (IITP2023-RS-2022-00156389) supervised by the IITP (Institute for Information & communications Technology Planning & Evaluation).
