New execution model for CAPE using multiple threads on multicore clusters

  • Do, Xuan Huyen (Information Technology Department, University of Sciences, Hue University)
  • Ha, Viet Hai (Office for STIC, University of Education, Hue University)
  • Tran, Van Long (Technology and Business Department, Phu Xuan University)
  • Renault, Eric (LIGM, University Gustave Eiffel, CNRS, ESIEE Paris)
  • Received: 2020.05.22
  • Accepted: 2020.10.22
  • Published: 2021.10.01

Abstract

Owing to its simplicity and ease of use, OpenMP has become the standard model for programming shared-memory architectures. Checkpointing-aided parallel execution (CAPE) is an approach that uses the discontinuous incremental checkpointing technique (DICKPT) to automatically translate and execute OpenMP programs on distributed-memory architectures. Currently, CAPE implements the OpenMP execution model by using DICKPT to distribute parallel jobs and their data to slave machines and then collecting the results once those jobs have executed. Although this model has proven effective in terms of performance and compatibility with OpenMP on distributed-memory systems, it cannot fully exploit the capabilities of multicore processors. This paper presents a novel execution model for CAPE that uses two levels of parallelism: in addition to distributing jobs across slave machines, it runs multithreaded processes on each slave machine to better exploit its multicore CPU. Initial experimental results, presented near the end of the paper, show that this model significantly improves CAPE performance.
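The two levels of parallelism described above can be pictured as a nested partition of a parallel loop's iteration space: first across machines, then across threads within each machine. The following is a minimal sketch of that decomposition only, not CAPE's implementation. It involves no checkpointing or networking, the "nodes" are simulated by a plain loop on one machine, and every name in it (`split_range`, `run_on_node`, `two_level_parallel_for`) is hypothetical. Python threads are used purely to show the structure; they do not by themselves deliver multicore speedup for CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor


def split_range(start, stop, parts):
    """Split [start, stop) into `parts` contiguous chunks whose sizes differ by at most one."""
    size, extra = divmod(stop - start, parts)
    chunks = []
    lo = start
    for i in range(parts):
        hi = lo + size + (1 if i < extra else 0)
        chunks.append((lo, hi))
        lo = hi
    return chunks


def run_on_node(body, lo, hi, threads_per_node):
    """Level 2 (assumed): one simulated node splits its chunk across local threads."""
    def work(bounds):
        a, b = bounds
        return [body(i) for i in range(a, b)]

    with ThreadPoolExecutor(max_workers=threads_per_node) as pool:
        # pool.map preserves chunk order, so results stay in iteration order.
        parts = list(pool.map(work, split_range(lo, hi, threads_per_node)))
    return [value for part in parts for value in part]


def two_level_parallel_for(body, n, nodes, threads_per_node):
    """Level 1 (assumed): split [0, n) across nodes; each node then applies level 2."""
    results = []
    for lo, hi in split_range(0, n, nodes):
        # On a real cluster each chunk would go to a different machine;
        # this serial loop only stands in for that distribution step.
        results.extend(run_on_node(body, lo, hi, threads_per_node))
    return results


if __name__ == "__main__":
    print(two_level_parallel_for(lambda i: i * i, 10, nodes=2, threads_per_node=4))
```

With 2 simulated nodes and 4 threads per node, the 10 iterations are first cut into two chunks of 5, and each chunk is then cut into four sub-chunks, so the loop body runs in up to 8 concurrent pieces while the results are returned in the original iteration order.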

Acknowledgement

We would like to express our sincere thanks to the Ministry of Education and Training of Vietnam for funding this research.
