• Title/Summary/Keyword: SeqGAN

Automatic Generation of Korean Poetry using Sequence Generative Adversarial Networks (SeqGAN 모델을 이용한 한국어 시 자동 생성)

  • Park, Yo-Han;Jeong, Hye-Ji;Kang, Il-Min;Park, Cheon-Young;Choi, Yong-Seok;Lee, Kong Joo
    • Annual Conference on Human and Language Technology / 2018.10a / pp.580-583 / 2018
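  • In this paper, we automatically generate Korean poetry using the SeqGAN model. For sentence generation, SeqGAN applies a recurrent neural network to the generator, together with policy gradient, a reinforcement learning algorithm, and Monte Carlo (MC) search. Poems on the theme of love were used as training data for generating lines of poetry. Although the poems generated by SeqGAN tended to repeat the same phrase several times, we confirmed that the SeqGAN model is applicable to Korean text generation.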

Applying SeqGAN Algorithm to Software Bug Repair (소프트웨어 버그 정정에 SeqGAN 알고리즘을 적용)

  • Yang, Geunseok;Lee, Byungjeong
    • Journal of Internet Computing and Services / v.21 no.5 / pp.129-137 / 2020
  • Recently, software size and code complexity have grown as software is applied in more fields. As a result, program bugs are inevitable and the cost of software maintenance is increasing. In open source projects, developers spend a great deal of time debugging when resolving the bug reports assigned to them. To address this problem, we apply the SeqGAN algorithm to software bug repair. Specifically, the SeqGAN model is trained on source code, and similar open source code is also used during training. A fitness function evaluates the suitability of each generated candidate patch, and a repair is considered successful if the patch passes all test cases. In a comparison with the baseline, the proposed model showed better repair results.

Automatic Generation of Training Corpus for a Sentiment Analysis Using a Generative Adversarial Network (생성적 적대 네트워크를 이용한 감성인식 학습데이터 자동 생성)

  • Park, Cheon-Young;Choi, Yong-Seok;Lee, Kong Joo
    • Annual Conference on Human and Language Technology / 2018.10a / pp.389-393 / 2018
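  • Advances in deep learning have greatly improved natural language processing fields such as machine translation and dialogue systems. Improving the performance of a deep learning model requires a large amount of data, but collecting that much data takes considerable time and effort. In this study, we apply the generative adversarial network (GAN), which performs well as an image generation model, to sentence generation. We modify the SeqGAN model to automatically generate sentences conditioned on positive or negative sentiment, and we test whether a SeqGAN that includes a classifier can automatically generate training data for positive/negative sentiment analysis. In our experiments, training on a mix of sentences generated by the classifier-equipped SeqGAN and the real training data yielded higher accuracy than training on the real training data alone.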

Identification of Prostate Cancer LncRNAs by RNA-Seq

  • Hu, Cheng-Cheng;Gan, Ping;Zhang, Rui-Ying;Xue, Jin-Xia;Ran, Long-Ke
    • Asian Pacific Journal of Cancer Prevention / v.15 no.21 / pp.9439-9444 / 2014
  • Purpose: To identify prostate cancer lncRNAs using a pipeline proposed in this study, which is applicable to the identification of lncRNAs that are differentially expressed in prostate cancer tissues but have negligible potential to encode proteins. Materials and Methods: We used two publicly available RNA-Seq datasets from normal prostate tissue and prostate cancer. Putative lncRNAs were predicted using the biological technology, then lncRNAs specific to prostate cancer were found by differential expression analysis, and a co-expression network was constructed by weighted gene co-expression network analysis. Results: A total of 1,080 lncRNA transcripts were obtained from the RNA-Seq datasets. Three genes (PCA3, C20orf166-AS1, and RP11-267A15.1) showed significant differential expression in the prostate cancer tissues and were thus identified as prostate cancer specific lncRNAs. Brown and black modules had significant negative and positive correlations with prostate cancer, respectively. Conclusions: The pipeline proposed in this study is useful for the prediction of prostate cancer specific lncRNAs. Three genes (PCA3, C20orf166-AS1, and RP11-267A15.1) were identified as having significant differential expression in prostate cancer tissues. However, no published studies have demonstrated the specificity of RP11-267A15.1 in prostate cancer tissues. Thus, the results of this study can provide new theoretical insight into the identification of prostate cancer specific genes.

Traffic Data Generation Technique for Improving Network Attack Detection Using Deep Learning (네트워크 공격 탐지 성능향상을 위한 딥러닝을 이용한 트래픽 데이터 생성 연구)

  • Lee, Wooho;Hahm, Jaegyoon;Jung, Hyun Mi;Jeong, Kimoon
    • Journal of the Korea Convergence Society / v.10 no.11 / pp.1-7 / 2019
  • Recently, various machine learning approaches to network attack detection have been studied and applied to detect new attacks and increase precision. However, machine learning methods depend on feature extraction, which is time-consuming and complex. Their performance is also limited by imbalance in the training data. In this study, we propose a method to address the degradation of classification performance caused by training data imbalance, one of the limitations of detection systems. To do this, we generate data using Generative Adversarial Networks (GANs) and propose a classification method using Convolutional Neural Networks (CNNs). With this approach, we confirm that accuracy improves on the NSL-KDD and UNSW-NB15 datasets.

Characterization of the first mitogenomes of the smallest fish in the world, Paedocypris progenetica, from peat swamp of Peninsular Malaysia, Selangor, and Perak

  • Hussin, NorJasmin;Azmir, Izzati Adilah;Esa, Yuzine;Ahmad, Amirrudin;Salleh, Faezah Mohd;Jahari, Puteri Nur Syahzanani;Munian, Kaviarasu;Gan, Han Ming
    • Genomics & Informatics / v.20 no.1 / pp.12.1-12.7 / 2022
  • Two complete mitochondrial genomes (mitogenomes) of Paedocypris progenetica, the smallest fish in the world, which belongs to the Cyprinidae family, were sequenced and assembled. The circular DNA molecules of mitogenomes P1-P. progenetica and S3-P. progenetica were 16,827 and 16,616 bp in length, respectively, and encoded 13 protein-coding genes, 22 transfer RNA genes, two ribosomal RNA genes, and one control region. The gene arrangements of P. progenetica were identical to those of other Paedocypris species. BLAST and phylogenetic analyses revealed variations in the mitogenome sequences of two Paedocypris species from Perak and Selangor. The circular DNA molecule of P. progenetica yielded a standard vertebrate gene arrangement and an overall nucleotide composition of A 33.0%, T 27.2%, C 23.5%, and G 15.5%. The overall AT content of this species was consistent with that of other species in other genera. The negative GC-skew and positive AT-skew of the control region in P. progenetica indicated rich genetic variability and AT nucleotide bias, respectively. The results of this study provide genomic variation information and enhance the understanding of the mitogenome of P. progenetica. They could later provide valuable new data for phylogenetic analysis and population genetics.
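
The AT-skew and GC-skew mentioned in the abstract above are commonly defined as (A - T)/(A + T) and (G - C)/(G + C). The sketch below is a minimal illustration under that assumption; the function names `skews_from_counts` and `base_skews` are hypothetical and not taken from the paper. Note that the abstract reports skew signs for the control region, while the only numbers it gives are the whole-mitogenome composition, so the usage example shows the calculation rather than reproducing the paper's result.

```python
from collections import Counter


def skews_from_counts(a: float, t: float, g: float, c: float) -> dict:
    """Return AT-skew and GC-skew from base counts or percentages.

    Assumes the common definitions:
      AT-skew = (A - T) / (A + T)
      GC-skew = (G - C) / (G + C)
    A zero denominator yields a skew of 0.0.
    """
    at_skew = (a - t) / (a + t) if (a + t) else 0.0
    gc_skew = (g - c) / (g + c) if (g + c) else 0.0
    return {"AT_skew": at_skew, "GC_skew": gc_skew}


def base_skews(seq: str) -> dict:
    """Count A, T, G, C in a DNA string (case-insensitive) and return the skews."""
    counts = Counter(seq.upper())
    return skews_from_counts(counts["A"], counts["T"], counts["G"], counts["C"])


if __name__ == "__main__":
    # Whole-mitogenome composition quoted in the abstract (percentages).
    overall = skews_from_counts(a=33.0, t=27.2, g=15.5, c=23.5)
    print(f"AT-skew: {overall['AT_skew']:.3f}, GC-skew: {overall['GC_skew']:.3f}")
```

With the quoted overall composition, the AT-skew comes out positive and the GC-skew negative, the same signs the abstract reports for the control region.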