• Title/Summary/Keyword: parallel performance

Search Result 2,851, Processing Time 0.03 seconds

PERFORMANCE OF A KNIGHT TOUR PARALLEL ALGORITHM ON MULTI-CORE SYSTEM USING OPENMP

  • VIJAYAKUMAR SANGAMESVARAPPA;VIDYAATHULASIRAMAN
    • Journal of applied mathematics & informatics
    • /
    • v.41 no.6
    • /
    • pp.1317-1326
    • /
    • 2023
  • Today's computers, desktops and laptops were build with multi-core architecture. Developing and running serial programs in this multi-core architecture fritters away the resources and time. Parallel programming is the only solution for proper utilization of resources available in the modern computers. The major challenge in the multi-core environment is the designing of parallel algorithm and performance analysis. This paper describes the design and performance analysis of parallel algorithm by taking the Knight Tour problem as an example using OpenMP interface. Comparison has been made with performance of serial and parallel algorithm. The comparison shows that the proposed parallel algorithm achieves good performance compared to serial algorithm.

A Study on Detection Performance Comparison of Bone Plates Using Parallel Convolution Neural Networks (병렬형 합성곱 신경망을 이용한 골절합용 판의 탐지 성능 비교에 관한 연구)

  • Lee, Song Yeon;Huh, Yong Jeong
    • Journal of the Semiconductor & Display Technology
    • /
    • v.21 no.3
    • /
    • pp.63-68
    • /
    • 2022
  • In this study, we produced defect detection models using parallel convolution neural networks. If convolution neural networks are constructed parallel type, the model's detection accuracy will increase and detection time will decrease. We produced parallel-type defect detection models using 4 types of convolutional algorithms. The performance of models was evaluated using evaluation indicators. The model's performance is detection accuracy and detection time. We compared the performance of each parallel model. The detection accuracy of the model using AlexNet is 97 % and the detection time is 0.3 seconds. We confirmed that when AlexNet algorithm is constructed parallel type, the model has the highest performance.

Implementation and Performance Evaluation of Parallel Programming Translator for High Performance Fortran (High Performance Fortran 병렬 프로그래밍 변환기의 구현 및 성능 평가)

  • Kim, Jung-Gwon;Hong, Man-Pyo;Kim, Dong-Gyu
    • The Transactions of the Korea Information Processing Society
    • /
    • v.6 no.4
    • /
    • pp.901-915
    • /
    • 1999
  • Parallel computers are known to be excellent in performance per cost also satisfying scalability and high performance. However parallel machines have enjoyed limited success because of difficulty in parallel programming and non-portability between parallel machines. Recently, researchers have sought to develop data parallel language that provides machine independent programming systems. Data parallel language such as High Performance Fortran provides a basis to write a parallel program based on a global name space by partitioning data and computation, generating message-passing function. In this paper, we describe the Parallel Programming Translator(PPTran), source-to-source data parallel compiler, generating MPI SPMD parallel program from HPF input program through four phases such as data dependence analysis, partitioning data, partitioning computation, and code generation with explicit message-passing and verify the performance of PPTran

  • PDF

Parallel Contact Treatment and Parallel Performance of Impact Simulation Based on Lagrangian Scheme (Lagrangian 기법에 의한 충돌 해석 시 접촉처리의 병렬화 및 병렬효율 평가)

  • Back, Seung-Hoon;Kim, Seung-Jo;Lee, Min-Hyung
    • Transactions of the Korean Society of Mechanical Engineers A
    • /
    • v.30 no.11 s.254
    • /
    • pp.1447-1454
    • /
    • 2006
  • The evaluation of parallel performance of a high speed impact simulation is not an easy task because not only the development of parallel explicit code is difficult but also a large number of processors is not easily accessible. In this paper, the parallel performance of a new Lagrangian FEM impact code carried out on cluster supercomputer has been described in high speed range. In the case of metal sphere impacting to oblique plate, the overall speed-up continuously increases even up to 128 CPUs. Investigation of elapsed time of each part reveals that most of the inefficiency comes from the load imbalance of contact.

Adaptive and optimized agent placement scheme for parallel agent-based simulation

  • Jin, Ki-Sung;Lee, Sang-Min;Kim, Young-Chul
    • ETRI Journal
    • /
    • v.44 no.2
    • /
    • pp.313-326
    • /
    • 2022
  • This study presents a noble scheme for distributed and parallel simulations with optimized agent placement for simulation instances. The traditional parallel simulation has some limitations in that it does not provide sufficient performance even though using multiple resources. The main reason for this discrepancy is that supporting parallelism inevitably requires additional costs in addition to the base simulation cost. We present a comprehensive study of parallel simulation architectures, execution flows, and characteristics. Then, we identify critical challenges for optimizing large simulations for parallel instances. Based on our cost-benefit analysis, we propose a novel approach to overcome the performance constraints of agent-based parallel simulations. We also propose a solution for eliminating the synchronizing cost among local instances. Our method ensures balanced performance through optimal deployment of agents to local instances and an adaptive agent placement scheme according to the simulation load. Additionally, our empirical evaluation reveals that the proposed model achieves better performance than conventional methods under several conditions.

Performance Evaluation of a Parallel DEVS Simulation Environment of P-DEVSIM ++ (병렬 DEVS 시뮬레이션 환경(P-DEVSIM ++) 성능 평가)

  • 성영락
    • Journal of the Korea Society for Simulation
    • /
    • v.2 no.1
    • /
    • pp.31-44
    • /
    • 1993
  • Zeigler's DEVS(Discrete Event Systems Specification) formalism supports formal specification of discrete event systems in a hierarchical , modular manner. Associated are hierarchical, distributed simulation algorithms, called abstract simulators, which interpret dynamics of DEVS models. This paper deals with performance evaluation of P-DEVSIM ++, a parallel simulation environment which implements the DEVS formalism and associated simulation algorithms in a parallel environment. Performance simulator has been developed and used to experiment models of parallel simulation executions in different conditions. The experimental result shows that simulation time depends on both the number of processors in the parallel system and the communication overheads among such processors.

  • PDF

Analaysis and design of redundant parallel manipulators (여유 자유도 병렬형 로봇의 분석 및 설계)

  • Kim, Sung-Bok
    • Journal of Institute of Control, Robotics and Systems
    • /
    • v.3 no.5
    • /
    • pp.482-489
    • /
    • 1997
  • This paper presents the analysis of the kinematics and dynamics of redundant parallel manipulators, and provides design guides for advanced parallel mainpulators with high performance. Three types of redundancies are considered which include the redundancies in serial chain, joint actuation, and parallelism. First, the kinematic and dynamic models of a redundant parallel manipulator are obtained in both joint and Cartesian spaces, and the kinematic and dynamic manipulabilities are defined for the performance evaluation. The effects of the three types of redundancies on the kinematic and dynamic performance of a parallel manipulator are then analyzed and compared, providing a set of guides for the design of advanced parallel manipulators. Finally, the simulation results using planer parallel manipulators are given.

  • PDF

Implementation and Performance Analysis of High Performance Computing Library for Parallel Processing (병렬처리를 위한 고성능 라이브러리의 구현과 성능 평가)

  • 김영태;이용권
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.31 no.7
    • /
    • pp.379-386
    • /
    • 2004
  • We designed a portable parallel library HPCL(High Performance Computing Library) with following objectives: (1) to provide a close relationship between the parallel code and the original sequential code that will help future versions of the sequential code and (2) to enhance performance of the parallel code. The library is an interface written in C and Fortran programming languages between MPI(Message Passing Interface) and parallel programs in Fortran. Performance results were determined on clusters of PC's and IBM SP4.

Parallel Performance of Preconditioned Navier-Stokes Code on Myrinet Environment (Myrinet 환경에서 예조건화 Navier-Stokes 코드의 병렬처리 성능)

  • Kim M.-H.;Lee G. S.;Choi J.-Y.;Kim K. S.;Kim S.-L.;Jeung I.-S.
    • 한국전산유체공학회:학술대회논문집
    • /
    • 2001.05a
    • /
    • pp.149-154
    • /
    • 2001
  • Parallel performance of a Myrinet based PC-cluster was tested and compared with a conventional Fast-Ethernet system. A preconditioned Navier-Stokes code was parallelized with domain decomposition technique, and used for the parallel performance test. Speed-up ratio was examined as a major performance parameter depending on the number of processor and the network topology. As was expected, Myrinet system shows a superior parallel performance to the Fast-Ethernet system even with a single network adpater for a dual processor SMP machine. A test for the dependency on problem size also shows that network communication speed is a crucial factor for parallelized computational fluid dynamics analysis and the Myrinet system is a plausible candidate for high performance parallel computing system.

  • PDF

Performance Characterization of Tachyon Supercomputer using Hybrid Multi-zone NAS Parallel Benchmarks (하이브리드 병렬 프로그램을 이용한 타키온 슈퍼컴퓨터의 성능)

  • Park, Nam-Kyu;Jeong, Yoon-Su;Yi, Hong-Suk
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.14 no.1
    • /
    • pp.138-144
    • /
    • 2010
  • Tachyon primary system which introduces recently is a high performance supercomputer that composed with AMD Barcelona nodes. In this paper, we will verify the performance and parallel scalability of TachyonIn by using multi-zone NAS Parallel Benchmark(NPB) which is one of a program with hybrid parallel method. To test performance of hybrid parallel execution, B and C classes of BT-MZ in NPB version 3.3 were used. And the parallel scalability test has finished with Tachyon's 1024 processes. It is the first time in Korea to get a result of hybrid parallel computing calculation using more than 1024 processes. Hybrid parallel method in high performance computing system with multi-core technology like Tachyon describes that it can be very efficient and useful parallel performance benchmarks.