• Title/Summary/Keyword: unicode chinese character


Support on Ideograph Characters Search of Unicode Based Information System (정보 시스템의 유니코드 기반 한자 검색 지원)

  • Yoon, So-Young
    • Journal of the Korean Society for Information Management / v.24 no.4 / pp.375-391 / 2007
  • The Unicode Han ideograph character set is ordered by KangXi radical and stroke count, which differs from the phonetic-value ordering conventionally used in Korea. Information systems should therefore support ideograph search based on a precise analysis of materials that mix Korean characters (Hangul) and ideographs (Hanja). The History Information System maintains a Hanja (Chinese character)-to-Hangul dictionary, a terminology dictionary for composition, borrowing, and non-ideographic principles, a variant forms dictionary, and a list of recently discovered Chinese characters.
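
The ordering difference described in this abstract can be made concrete with a small sketch. The four-entry reading table is a hypothetical sample written for illustration, not data from the paper. Within the main CJK Unified Ideographs block, code point order follows radical-stroke order, so sorting by code point stands in for it here.

```python
# Hypothetical sample: each Hanja mapped to one common Korean reading.
READINGS = {"一": "일", "人": "인", "大": "대", "家": "가"}

def by_code_point(chars):
    """Sort by Unicode code point (radical-stroke order within the main CJK block)."""
    return sorted(chars, key=ord)

def by_reading(chars, readings=READINGS):
    """Sort by Hangul reading, the phonetic order used in Korean dictionaries."""
    return sorted(chars, key=lambda c: readings[c])

sample = list(READINGS)
print(by_code_point(sample))  # radical-stroke style order
print(by_reading(sample))     # phonetic (Hangul) order
```

A search interface that expects phonetic order therefore needs its own reading index on top of the code points; the code points alone do not provide it.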

A study on Mapping the Unicode based Hangul-Hanja for prescription names in Korean Medicine (처방명 연계를 위한 유니코드 한자 기반의 한글-한자 매핑정보 구축에 관한 연구)

  • Jeon, Byoung-Uk;Kim, An-Na;Kim, Ji-Young;Oh, Yong-Taek;Kim, Chul;Song, Mi-Young;Jang, Hyun-Chul
    • Korean Journal of Oriental Medicine / v.18 no.3 / pp.133-139 / 2012
  • Objective: UMLS is an 'ontology' that builds a database of medical terminology by gathering various medical vocabularies that represent the same fundamental concepts. Method: Although Chinese characters are represented in the Chinese-character part of the Korean Unicode system, the way they are written varies with the input system and the writer's level of knowledge. As a result, the representation of Chinese writing in a computer can differ considerably from that in an old Chinese document. A meaningful relationship between digital Chinese terminology and its Korean translation is therefore needed to build an ontology of Chinese medical terms from Oriental medical prescriptions. Result: This research presents 1:1 mapping information among the Chinese characters used in Oriental medical prescriptions, with an analysis of 'same character, different sound' and 'same meaning, different shape' cases in the Chinese-character part of Unicode. Conclusions: The research also provides a top-down menu of relationships between Chinese and Korean terms in medical prescriptions, on the assumption that each Oriental medical prescription has its own unique meaning.
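
One well-known source of 'same character, different sound' duplicates is the set of CJK compatibility ideographs that Unicode inherited from the Korean legacy standard, where some Hanja were encoded once per reading. The sketch below shows how Python's standard `unicodedata` module folds two such duplicates into their unified forms. It illustrates the general problem only and is not the mapping table built in the paper; the helper name `unify` is hypothetical.

```python
import unicodedata

def unify(ch):
    """Map a CJK compatibility ideograph to its unified form via NFC."""
    return unicodedata.normalize("NFC", ch)

# U+F90A and U+91D1 both display as 金; U+F914 and U+6A02 both display as 樂.
pairs = [("\uf90a", "\u91d1"), ("\uf914", "\u6a02")]
for compat, unified in pairs:
    folded = unify(compat)
    print(f"U+{ord(compat):04X} -> U+{ord(folded):04X}", folded == unified)
```

Normalization handles only the duplicates that Unicode itself marks as equivalent. Variant shapes with separate unified code points would still need a hand-built table.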

A study of MeSH Compatibility between Korea and Chinese (한국과 중국의 MeSH 호환성 연구)

  • Kwon, Young-Kyu;Lee, Byung-Wook
    • Journal of the Korean Institute of Oriental Medical Informatics / v.11 no.2 / pp.65-82 / 2005
  • The findings of this study are summarized as follows: 1. Hangul 2004 has 16,023 Chinese character codes. Of these, 15,231 can be searched in the database; the rest are unsearchable. 2. Of the 15,231 searchable Chinese character codes in Hangul 2004, 2,471 are converted into 2,232 Simplified Chinese character codes by the Traditional-Simplified Chinese character conversion program in Hangul 2004. 3. The 5th edition of TCM-MeSH has 6,385 thesaurus entries and 2,142 distinct Chinese characters. 4. Searching TCM-MeSH with the Simplified Chinese characters of Hangul 2004 finds 94.3% of TCM-MeSH, but searching with the Traditional Chinese characters of Hangul 2004 finds only 34.2%.
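
A coverage figure like the ones in finding 4 can be computed as the share of thesaurus terms that a given character repertoire can spell. The abstract does not say whether its percentages count terms or characters, so the term-level definition here is an assumption. The function name and the toy data are hypothetical.

```python
def term_coverage(terms, charset):
    """Return the fraction of terms whose characters all appear in charset."""
    if not terms:
        return 0.0
    found = sum(1 for term in terms if all(ch in charset for ch in term))
    return found / len(terms)

# Hypothetical toy data, not taken from TCM-MeSH or Hangul 2004.
terms = ["针灸", "经络", "上中下"]    # thesaurus terms written in Simplified Chinese
traditional = set("鍼灸經絡上中下")  # repertoire in Traditional form
simplified = set("针灸经络上中下")   # the same repertoire after conversion

print(f"{term_coverage(terms, traditional):.1%}")
print(f"{term_coverage(terms, simplified):.1%}")
```

In the toy data, only the term whose characters have the same form in both scripts is found with the Traditional repertoire, which is the same effect the study reports at scale.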


Consideration of CJK Joint Hanja Unicode when is used in AMI/HDB-3 Line Coding (AMI/HDB-3 회선부호화와 한·중·일 한자 유니코드 체계 고찰)

  • Tai, Dong-Zhen;Hong, Wan Pyo
    • The Journal of the Korea Institute of Electronic Communication Sciences / v.8 no.7 / pp.1011-1015 / 2013
  • This paper analyses the rate at which the Unicode codes of CJK unified Chinese characters violate the source coding rule. The study selected 150 Chinese characters in Unicode that have a relatively high frequency of use. These 150 characters account for about 50% of the total frequency of use of Chinese characters. AMI/HDB-3 line coding/scrambling and the HDLC protocol were applied in the study. According to the analysis, 77 of the 150 characters violated the rule, and they account for 29% of the frequency of use. Therefore, when the 77 violating characters are replaced with character codes that conform to the source coding rule, the processing rate of the line coder can be improved by about 37%.
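
HDB-3 substitutes every run of four consecutive zero bits, so one rough way to flag codes that would trigger substitution is to look for such runs in a character's 16-bit code. The abstract does not define its source coding rule, so the criterion below is an assumption made for illustration and may differ from the paper's. The function names are hypothetical, and runs that span two adjacent characters in a bit stream are ignored.

```python
def longest_zero_run(code_point, width=16):
    """Length of the longest run of consecutive 0 bits in a fixed-width code."""
    bits = format(code_point, f"0{width}b")
    return max(len(run) for run in bits.split("1"))

def triggers_hdb3_substitution(ch):
    """True if the 16-bit code of ch contains four or more consecutive zeros."""
    return longest_zero_run(ord(ch)) >= 4

for ch in "的一人":
    print(ch, format(ord(ch), "016b"), triggers_hdb3_substitution(ch))
```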

Study on the Chinese Character Use in Acupuncture & Moxibustion Textbook (침구학 교재에서의 한자사용 분석연구)

  • Chae, Han;Hwang, Sang-Moon;Lee, Byung-Wook;Yang, Gi-Young;Lee, Byung-Ryul;Kim, Jae-Kyu
    • Journal of Acupuncture Research / v.27 no.4 / pp.187-194 / 2010
  • Objectives: There has been a need to establish an operational curriculum for the Chinese characters and Chinese writing used in traditional Korean medicine (TKM), but this need has not been thoroughly recognized so far. Methods: We analysed the usage of Unicode Chinese characters in an acupuncture & moxibustion textbook to identify the prerequisite Chinese characters for TKM studies from a clinical perspective. Results: The 20 most frequently used Chinese characters were 穴, 經, 鍼, 法, 寸, 部, 分, 刺, 下, 上, 中, 位, 氣, 陽, 灸, 脈, 陰, 治, 足, 主. We also showed that adequate prerequisite Chinese characters should be designated for more efficient TKM education. Conclusions: This study was the first systematic approach to identifying essential and prerequisite Chinese characters for TKM education, especially for acupuncture & moxibustion. The prerequisite characters identified in this study will be used in developing the KEET (Korean Medicine Education Eligibility Test), entrance exams for the Colleges of Oriental Medicine, textbooks, and the educational curriculum for premed students.
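
A frequency count of this kind is straightforward to sketch. The version below counts only characters in the main CJK Unified Ideographs block (U+4E00 to U+9FFF); whether the study also counted extension-block or compatibility characters is not stated, so that restriction is an assumption. The sample sentence is a hypothetical stand-in for a textbook corpus.

```python
from collections import Counter

def is_cjk_ideograph(ch):
    """True for characters in the main CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"

def top_characters(text, n=20):
    """Return the n most frequent CJK ideographs in text with their counts."""
    counts = Counter(ch for ch in text if is_cjk_ideograph(ch))
    return counts.most_common(n)

# Hypothetical snippet standing in for a textbook corpus.
sample = "經穴은 經脈 위의 穴이다. 鍼과 灸는 經穴에 쓴다."
print(top_characters(sample, n=3))
```

Hangul, punctuation, and Latin text are skipped, so mixed Hangul-Hanja text can be passed in without preprocessing.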

A Character Shape Encoding Method to Input Chinese Characters in Old Documents (고문헌 벽자(僻字) 입력을 위한 한자 자형 부호화 방법)

  • Kim, Kiwang
    • Journal of Korean Medical Classics / v.32 no.1 / pp.105-116 / 2019
  • Objectives: Ancient classic literature contains many rare Chinese characters, so-called Byeokja (僻字), and it often includes Chinese characters that are not registered in Unicode as well as variant characters that cannot be found in current font sets. To register all possible Chinese characters, including these, as units of information exchange, this study proposes a method for encoding the morphological information of Chinese characters according to fixed rules. Methods: This study suggests methods for encoding the connections between the nodes that make up a Chinese character and the coordinates of those nodes. It also adds rules for expressing information about curves, expressing the aspect ratio of characters, minimizing coordinate lines, and expressing the aggregation status of character components. Results: The proposed method can generate codes of a certain length by extracting only the information that expresses the morphological configuration of a character. Conclusions: The character encoding method proposed in this study can be used to distinguish, store, and search variant characters with small variations, including Byeokja, new Chinese characters, and character strokes.
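
To give a feel for what encoding nodes and their connections might look like, here is a deliberately simplified sketch. It is not the encoding proposed in the paper: the 0-9 grid, the digit-pair format, and the "/" stroke separator are all assumptions chosen for illustration, and curves, aspect ratios, and component grouping are ignored. Consecutive nodes within a stroke are taken to be connected.

```python
def encode_shape(strokes):
    """Encode strokes as digit pairs; each stroke is a list of (x, y) grid nodes."""
    for stroke in strokes:
        for x, y in stroke:
            if not (0 <= x <= 9 and 0 <= y <= 9):
                raise ValueError("node coordinates must lie on a 0-9 grid")
    return "/".join("".join(f"{x}{y}" for x, y in stroke) for stroke in strokes)

def decode_shape(code):
    """Invert encode_shape: rebuild the list of strokes from the code string."""
    return [
        [(int(part[i]), int(part[i + 1])) for i in range(0, len(part), 2)]
        for part in code.split("/")
    ]

# 十 drawn as one horizontal and one vertical stroke on the hypothetical grid.
ten = [[(0, 5), (9, 5)], [(5, 0), (5, 9)]]
code = encode_shape(ten)
print(code)
```

Because the code is derived from shape alone, two variants that differ by a single stroke produce different strings even when neither has a Unicode code point.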

Study on the Prerequisite Chinese Characters for Education of Traditional Korean Medicine (한의학 입문을 위한 필수한자 추출 및 분석연구)

  • Chae, Han;Hwang, Sang-Moon;Kwon, Young-Kyu;Baik, Yu-Sang;Shin, Sang-Woo;Yang, Gi-Young;Lee, Byung-Ryul;Kim, Jae-Kyu;Lee, Byung-Wook
    • Journal of Physiology & Pathology in Korean Medicine / v.24 no.3 / pp.373-379 / 2010
  • There has been a need to establish an operational curriculum for the Chinese characters and Chinese writing used in traditional Korean medicine (TKM), but this need has not been carefully recognized so far. We analysed the frequency of Unicode Chinese characters in five medical textbooks and identified prerequisite Chinese characters for TKM beginners. The 20 most frequently used Chinese characters were 之, 者, 不, 也, 而, 氣, 陽, 陰, 下, 其, 病, 爲, 人, 以, 中, 則, 於, 脈, 上, 故. We also showed that adequate prerequisite Chinese characters should be designated for more efficient TKM education. This study was the first systematic approach to identifying essential and prerequisite Chinese characters for TKM education. The prerequisite characters identified in this study will be used in developing the KEET (Korean Medicine Education Eligibility Test), entrance exams for the Colleges of Oriental Medicine, textbooks, and the educational curriculum for premed students.

A Comparison of Korean and Chinese Character Codes in the Song Edition of "Shanghanlun" (송본(宋本) "상한론(傷寒論)"의 한중(韓中) Code 비교(比較))

  • Lee, Byeong-Uk;Sin, Sang-U;Kim, Eun-Ha
    • Journal of Korean Medical Classics / v.18 no.4 s.31 / pp.83-92 / 2005
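  • To this day, Oriental medicine has been studied and developed in Korea, China, and Japan, each according to its own characteristics. These three countries have also continually cooperated and worked toward the globalization and scientific advancement of Oriental medicine. As Oriental medicine has moved toward globalization, the need for the three countries to exchange the medical terminology they use and their existing literature has grown steadily. However, a major obstacle has emerged in the process of exchanging literature and standardizing medical terminology, and that obstacle is Unicode. Unicode was, of course, created to allow such information to be exchanged flexibly between countries. Before Unicode was established, however, each country had already developed a Chinese character system suited to its own needs, and today's Unicode was built from those existing Chinese character codes. Because Unicode was established in this way, it ended up with a number of systemic inconsistencies. These problems have effects beyond the system itself and also hinder the flexible exchange of Korean medicine information. To address these problems, the author took the Shanghanlun (傷寒論) as the object of study and compared the differences between the Korean and Chinese character codes.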


A Study on Data Sharing Codes Definition of Chinese in CAI Application Programs (CAI 응용프로그램 작성시 자료공유를 위한 한자 코드 체계 정의에 관한 연구)

  • Kho, Dae-Ghon
    • Journal of The Korean Association of Information Education / v.2 no.2 / pp.162-173 / 1998
  • Writing a CAI program that contains Chinese characters requires a common Chinese character code so that information can be shared for educational purposes. Such a code needs to allow mixed use of vowel order and stroke order, to represent both Simplified Chinese and Japanese forms of Chinese characters, and to support conversion for data exchange among different Chinese code sets. Vowel order is expected to waste code space because heteronyms are treated as different characters. Stroke order, by contrast, facilitates data recovery and prevents duplicate codes, though it does not comply with the phonetic rule. We claim that the first- and second-level Chinese code areas need to be expanded as far as academic and industrial circles have demanded. We also assert that Unicode can serve as a temporary measure for an educational code system because of its interoperability, expandability, and expressivity of character sets.
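
The need for a conversion process among code sets can be seen by encoding one character under several national encodings. The sketch below uses codec names that ship with Python's standard library; the sample character 中 and the helper names are arbitrary choices for illustration. Conversion goes through Unicode as the pivot, which is the interoperability role the abstract assigns to it.

```python
# Standard codec names shipped with Python; the sample character is arbitrary.
CODECS = ["euc_kr", "gb2312", "shift_jis", "utf-8"]

def encode_everywhere(ch, codecs=CODECS):
    """Return the byte sequence of ch under each codec."""
    return {name: ch.encode(name) for name in codecs}

def convert(data, source, target):
    """Convert bytes from one code set to another by going through Unicode."""
    return data.decode(source).encode(target)

table = encode_everywhere("中")
for name, raw in table.items():
    print(f"{name:10} {raw.hex()}")
```

A character missing from the target code set makes `encode` raise `UnicodeEncodeError`, so a real converter would need a policy for unmappable characters.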


Distance Measures in HMM Clustering for Large-scale On-line Chinese Character Recognition (대용량 온라인 한자 인식을 위한 클러스터링 거리계산 척도)

  • Kim, Kwang-Seob;Ha, Jin-Young
    • Journal of KIISE:Software and Applications / v.36 no.9 / pp.683-690 / 2009
  • One of the major problems in building a good large-scale on-line Chinese character recognition system with HMMs is increasing recognition time. In this paper, we propose a clustering method to solve the recognition speed problem, together with an efficient distance measure between HMMs. In our experiments, we obtained about twice the recognition speed and 95.37% 10-candidate recognition accuracy, a decrease of only 0.9%, for the 20,902 Chinese characters defined in Unicode CJK Unified Ideographs.
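
The figure of 20,902 matches the size of the original CJK Unified Ideographs repertoire, U+4E00 through U+9FA5, which can be checked with simple arithmetic. The class-label mapping in the sketch is an assumption about how a recognizer might index its classes; it is not taken from the paper, and the function names are hypothetical.

```python
# The original CJK Unified Ideographs repertoire spans U+4E00 through U+9FA5.
URO_FIRST, URO_LAST = 0x4E00, 0x9FA5

def uro_size():
    """Number of code points in the original CJK Unified Ideographs range."""
    return URO_LAST - URO_FIRST + 1

def class_index(ch):
    """Map an ideograph to a 0-based class label, as a recognizer might."""
    cp = ord(ch)
    if not URO_FIRST <= cp <= URO_LAST:
        raise ValueError(f"U+{cp:04X} is outside the 20,902-character range")
    return cp - URO_FIRST

print(uro_size())  # 20902
```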