• Title/Summary/Keyword: Catastrophic forgetting

A study on sequential iterative learning for overcoming catastrophic forgetting phenomenon of artificial neural network (인공 신경망의 Catastrophic forgetting 현상 극복을 위한 순차적 반복 학습에 대한 연구)

  • Choi, Dong-bin;Park, Young-beom
    • Journal of Platform Technology / v.6 no.4 / pp.34-40 / 2018
  • Artificial neural networks currently perform well on a single task, but they tend to forget previously learned tasks when trained on new kinds of tasks, a phenomenon called catastrophic forgetting. This must be solved before neural networks can serve as general-purpose learners, and despite many efforts it has not been completely overcome. In this paper, we propose sequential iterative learning based on the core concepts of elastic weight consolidation (EWC). Experiments reproduce the catastrophic forgetting phenomenon on the EMNIST data set, an extension of MNIST, which is widely used for neural network training, and overcome it through sequential iterative learning.
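
A minimal sketch of the EWC-style quadratic penalty the method builds on, in PyTorch. The `fisher` and `old_params` dictionaries (a diagonal Fisher estimate and the weights saved after the previous task) and the strength `lam` are illustrative assumptions, not the authors' implementation:

```python
import torch

def ewc_penalty(model, fisher, old_params, lam=100.0):
    """Quadratic EWC penalty: (lam / 2) * sum_i F_i * (theta_i - theta*_i)^2."""
    penalty = torch.zeros(())
    for name, param in model.named_parameters():
        if name in fisher:
            # fisher[name]: diagonal Fisher estimate for this parameter;
            # old_params[name]: weights saved after the previous task.
            penalty = penalty + (fisher[name] * (param - old_params[name]) ** 2).sum()
    return 0.5 * lam * penalty
```

Sequential iterative learning, as the title suggests, would then alternate the tasks (A, B, A, B, ...) and add this penalty to each task loss after the first pass, so earlier tasks are revisited rather than overwritten.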

Continual Learning using Data Similarity (데이터 유사도를 이용한 지속적 학습방법)

  • Park, Seong-Hyeon;Kang, Seok-Hoon
    • Journal of IKEEE / v.24 no.2 / pp.514-522 / 2020
  • In a continual learning environment, we identify that the Catastrophic Forgetting phenomenon, in which information about previously learned data is forgotten, occurs easily between data from different domains. To control this phenomenon, we introduce a way to measure the relationship between previously learned data and newly learned data through the distribution of the neural network's outputs, and a way to use this measurement to mitigate Catastrophic Forgetting. MNIST and EMNIST data were used for evaluation, and experiments showed an average 22.37% improvement in accuracy on previous data.
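
The abstract leaves the exact output-distribution measure open; the sketch below uses a symmetric KL divergence between mean softmax outputs as one plausible instantiation (the `eps` smoothing is an added assumption):

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def output_distribution_divergence(model, old_batch, new_batch, eps=1e-8):
    """Symmetric KL divergence between the model's mean softmax outputs
    on previously learned data and on newly arriving data. A small value
    suggests the same domain; a large one suggests a domain shift where
    forgetting is likely."""
    p = F.softmax(model(old_batch), dim=1).mean(dim=0) + eps
    q = F.softmax(model(new_batch), dim=1).mean(dim=0) + eps
    kl_pq = (p * (p / q).log()).sum()
    kl_qp = (q * (q / p).log()).sum()
    return 0.5 * (kl_pq + kl_qp)
```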

Improvement of Catastrophic Forgetting using variable Lambda value in EWC (가변 람다값을 이용한 EWC에서의 치명적 망각현상 개선)

  • Park, Seong-Hyeon;Kang, Seok-Hoon
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.1 / pp.27-35 / 2021
  • This paper proposes a method to mitigate the Catastrophic Forgetting phenomenon, in which artificial neural networks forget information about previous data. The method adjusts the regularization strength by measuring the relationship between previous and present data. MNIST and EMNIST data were used for performance evaluation across three scenarios. The results showed a 0.1~3% improvement in the accuracy of the previous task for same-domain data and a 10~13% improvement for different-domain data. When continuously learning data from various domains, the accuracy of every previous task stayed above 50% and the average accuracy improved by about 7%. This shows that, with the proposed method, neural network learning can proceed properly in a continual learning environment where data from different domains arrive in succession.
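
The regularization strength is adjusted from the measured relationship between previous and present data, but the abstract does not give the schedule; a hypothetical linear mapping might look like this, with the range endpoints as placeholders:

```python
def variable_lambda(similarity, lam_min=10.0, lam_max=1000.0):
    """Map a similarity score in [0, 1] to an EWC strength: dissimilar
    (different-domain) data gets a stronger pull toward the old weights,
    while similar data is allowed to move them more freely."""
    similarity = min(max(similarity, 0.0), 1.0)
    return lam_min + (1.0 - similarity) * (lam_max - lam_min)
```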

Efficient Path Selection in Continuous Learning Environment (지속적 학습 환경에서 효율적 경로 선택)

  • Park, Seong-Hyeon;Kang, Seok-Hoon
    • Journal of IKEEE / v.25 no.3 / pp.412-419 / 2021
  • In this paper, we propose a performance improvement of the LwF method using efficient path selection in a continual learning environment, and we compare its performance and structure with conventional LwF. For the comparison, we measure performance on MNIST, EMNIST, Fashion MNIST, and CIFAR10 data under configurations of differing complexity. Experiments show up to a 20% improvement in accuracy on each task, mitigating the Catastrophic Forgetting phenomenon in continual learning environments.
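
For reference, a minimal sketch of the conventional LwF objective the paper compares against: cross-entropy on the new task plus distillation toward the old model's recorded outputs. The temperature `T` and weight `lam` are conventional defaults rather than values from the paper, and the path-selection mechanism itself is not reproduced:

```python
import torch.nn.functional as F

def lwf_loss(new_logits, labels, old_head_logits, recorded_old_logits,
             T=2.0, lam=1.0):
    """Standard LwF loss: new-task cross-entropy plus a knowledge-
    distillation term that keeps the old-task head close to the outputs
    recorded before training on the new task began."""
    ce = F.cross_entropy(new_logits, labels)
    log_p = F.log_softmax(old_head_logits / T, dim=1)
    q = F.softmax(recorded_old_logits / T, dim=1)
    kd = F.kl_div(log_p, q, reduction="batchmean") * (T * T)
    return ce + lam * kd
```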

Domain adaptation of Korean coreference resolution using continual learning (Continual learning을 이용한 한국어 상호참조해결의 도메인 적응)

  • Yohan Choi;Kyengbin Jo;Changki Lee;Jihee Ryu;Joonho Lim
    • Annual Conference on Human and Language Technology / 2022.10a / pp.320-323 / 2022
  • Coreference resolution is the task of identifying mention candidates such as nouns, pronouns, and noun phrases in a document and grouping the mentions that refer to the same entity. Deep-learning research on Korean coreference resolution has mainly studied end-to-end models that obtain contextual word representations from BERT and perform mention detection and coreference resolution jointly; more recently, a start-to-end Korean coreference model was proposed that skips span representations and resolves coreference quickly from start and end representations alone. The ETRI data set recently built for Korean coreference resolution spans diverse domains such as WIKI, QA, and CONVERSATION, and whenever data for a new domain is added, the model must be retrained on the whole enlarged training set, which takes a long time. In this paper, we show that applying continual learning to the domain adaptation of the coreference model suppresses catastrophic forgetting, the loss of previously learned information, when the model is trained on data from different domains in turn. We also run experiments that combine two transfer techniques to improve continual learning performance. The proposed model outperformed the baseline by 3.6%p on the development set and 2.1%p on the test set.

Nonlinear Function Approximation of Moduled Neural Network Using Genetic Algorithm (유전 알고리즘을 이용한 모듈화된 신경망의 비선형 함수 근사화)

  • 박현철;김성주;김종수;서재용;전홍태
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2001.12a / pp.10-13 / 2001
  • A neural network consists of neurons and synapses. Synapses memorize the most recent pattern while learning new ones, so when a neural network learns a new pattern it tends to forget previously learned patterns. This phenomenon is called catastrophic interference, or catastrophic forgetting, and to overcome it the network must be modularized. In this paper, we propose a modular neural network composed of two networks: each network learns a different pattern individually, and their outputs are finally summed by a net function. Because training sometimes settles in a local minimum rather than the global minimum, we use a genetic algorithm to find the global minimum.
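
One way to picture the proposed structure, with two subnetworks whose outputs are summed by a net function; the layer sizes are arbitrary, and the genetic-algorithm search for the global minimum is not shown:

```python
import torch.nn as nn

class ModularNet(nn.Module):
    """Two independently trained subnetworks whose outputs are summed,
    so each module can specialize on a different pattern."""
    def __init__(self, in_dim=2, hidden=16, out_dim=1):
        super().__init__()
        def make_module():
            return nn.Sequential(nn.Linear(in_dim, hidden), nn.Tanh(),
                                 nn.Linear(hidden, out_dim))
        self.module_a = make_module()
        self.module_b = make_module()

    def forward(self, x):
        # The "net function" here is a simple sum of the module outputs.
        return self.module_a(x) + self.module_b(x)
```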

Adaptive Weight Control for Improvement of Catastrophic Forgetting in LwF (LwF에서 망각현상 개선을 위한 적응적 가중치 제어 방법)

  • Park, Seong-Hyeon;Kang, Seok-Hoon
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.1 / pp.15-23 / 2022
  • Among learning methods for continual learning environments, "Learning without Forgetting" (LwF) uses a fixed regularization strength, which can lead to poor performance in environments that receive varied data. We suggest a way to set the weight adaptively by identifying the features of the data to be learned, applying weights based on correlation and complexity. Scenarios with various data were used for evaluation, and experiments showed accuracy increases of up to 5% on the new task and up to 11% on the previous task. In addition, the adaptive weight obtained by the proposed algorithm approached the optimal weight calculated manually through repeated experiments for each scenario, with a correlation coefficient of 0.739, and the overall average task accuracy increased. This shows that the method sets an appropriate lambda value each time a new task is learned and derives optimal results across various scenarios.
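
The weight is set from the correlation and complexity of the incoming data, but the abstract does not give the formula; the combination below is purely hypothetical, for illustration only:

```python
def adaptive_lwf_weight(correlation, complexity, base=1.0):
    """Hypothetical adaptive lambda for LwF: distill harder when the new
    data correlates strongly with the old data, and ease off when the new
    data is more complex and the network needs plasticity to learn it.
    correlation is in [0, 1]; complexity is relative to the previous task."""
    correlation = min(max(correlation, 0.0), 1.0)
    return base * (0.5 + correlation) / max(complexity, 1e-6)
```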

Advanced LwF Model based on Knowledge Transfer in Continual Learning (지속적 학습 환경에서 지식전달에 기반한 LwF 개선모델)

  • Kang, Seok-Hoon;Park, Seong-Hyeon
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.3 / pp.347-354 / 2022
  • To reduce forgetting in continual learning, this paper proposes an improved LwF model based on a knowledge transfer method and shows its effectiveness by experiment. In LwF, when the domain or the complexity of the learned data differs, previously learned results become inaccurate due to forgetting; the phenomenon tends to worsen in particular when learning proceeds from complex data to simple data. To ensure that previous learning results are transferred sufficiently to the LwF model, we apply a knowledge transfer method to LwF and propose an algorithm for its efficient use. As a result, forgetting was reduced by an average of 8% compared with standard LwF, and the method remained effective even as the sequence of learning tasks grew long. In particular, when complex data was learned first, efficiency improved by more than 30% over LwF.
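
The abstract does not name the knowledge-transfer method applied to LwF; a common choice is feature-level distillation from the previous model into the new one, sketched here as an assumption:

```python
import torch.nn.functional as F

def feature_transfer_loss(student_features, teacher_features):
    """Feature-level knowledge transfer: pull the new (student) model's
    intermediate representation toward the previous (teacher) model's,
    so earlier learning is carried over as LwF training continues."""
    return F.mse_loss(student_features, teacher_features.detach())
```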

Multi-channel Long Short-Term Memory with Domain Knowledge for Context Awareness and User Intention

  • Cho, Dan-Bi;Lee, Hyun-Young;Kang, Seung-Shik
    • Journal of Information Processing Systems / v.17 no.5 / pp.867-878 / 2021
  • In context awareness and user intention tasks, dataset construction is expensive because domain-specific data are required. Although pretraining on a large corpus can effectively resolve the lack of data, it ignores domain knowledge. Here we concentrate on domain knowledge while addressing data scarcity, and accordingly propose a multi-channel long short-term memory (LSTM). Because the multi-channel LSTM integrates pretrained vectors for both task knowledge and general knowledge, it effectively prevents catastrophic forgetting between the two kinds of vectors and represents the context as a set of features. To evaluate the proposed model against the baseline, a single-channel LSTM, we performed two tasks: voice phishing detection with context awareness and movie review sentiment classification. The results verified that the multi-channel LSTM outperforms the single-channel LSTM on both tasks. We further experimented with multi-channel LSTMs that vary the domain and the data size of the general knowledge, and confirmed the effect of integrating the two types of knowledge, from downstream task data and from raw data, in overcoming the lack of data.
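
A minimal sketch of one way to realize the two-channel model: a frozen task-knowledge embedding and a frozen general-knowledge embedding feed separate LSTMs whose final hidden states are concatenated for classification. All sizes are placeholders, and a vocabulary shared by both channels is assumed:

```python
import torch
import torch.nn as nn

class MultiChannelLSTM(nn.Module):
    """Two embedding channels (task knowledge and general knowledge),
    each with its own LSTM; the final states are concatenated so neither
    kind of knowledge overwrites the other."""
    def __init__(self, task_vectors, general_vectors, hidden=128, classes=2):
        super().__init__()
        self.task_emb = nn.Embedding.from_pretrained(task_vectors, freeze=True)
        self.gen_emb = nn.Embedding.from_pretrained(general_vectors, freeze=True)
        self.task_lstm = nn.LSTM(task_vectors.size(1), hidden, batch_first=True)
        self.gen_lstm = nn.LSTM(general_vectors.size(1), hidden, batch_first=True)
        self.fc = nn.Linear(2 * hidden, classes)

    def forward(self, tokens):
        # tokens: (batch, seq_len) indices into a vocabulary shared
        # by both embedding tables (an assumption of this sketch).
        _, (h_task, _) = self.task_lstm(self.task_emb(tokens))
        _, (h_gen, _) = self.gen_lstm(self.gen_emb(tokens))
        return self.fc(torch.cat([h_task[-1], h_gen[-1]], dim=1))
```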

Domain-robust End-to-end Task-oriented Dialogue Model based on Dialogue Data Augmentation (대화 데이터 증강에 기반한 도메인에 강건한 종단형 목적지향 대화모델)

  • Kiyoung Lee;Ohwoog Kwon;Younggil Kim
    • Annual Conference on Human and Language Technology / 2022.10a / pp.531-534 / 2022
  • Neural network based deep learning has brought substantial performance improvements to dialogue processing. In particular, end-to-end dialogue models that take a large pretrained language model such as GPT-2 as the backbone network and fine-tune it on task-oriented dialogue data achieve high performance on the corresponding domain task. Most of these studies, however, focus on a single domain and do not handle two or more domains with a single model; sequential fine-tuning in particular causes catastrophic forgetting of previously learned domains, making performance drops on those tasks unavoidable. To address this, we apply a data augmentation scheme that adds open-domain chit-chat turns to the MultiWoz task-oriented dialogue data based on similarity, so that the model can generate both MultiWoz task-oriented dialogue and open-domain chit-chat depending on the user input and context. To evaluate system response generation on dialogues that mix task-oriented and chit-chat turns, we also built an extended MultiWoz evaluation set with manually added chit-chat turns.
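
The retrieval step of the similarity-based augmentation could be sketched as follows; the turn embeddings are assumed to be precomputed by some sentence encoder, and the threshold is a placeholder:

```python
import numpy as np

def most_similar_chitchat(turn_vec, chitchat_vecs, chitchat_turns,
                          threshold=0.7):
    """Return the open-domain chit-chat turn whose embedding is most
    cosine-similar to the task-oriented turn, or None if no candidate
    clears the threshold."""
    turn_vec = turn_vec / np.linalg.norm(turn_vec)
    normed = chitchat_vecs / np.linalg.norm(chitchat_vecs, axis=1,
                                            keepdims=True)
    sims = normed @ turn_vec
    best = int(np.argmax(sims))
    return chitchat_turns[best] if sims[best] >= threshold else None
```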
