Integrated Search | Korea Science

Exploring the feasibility of fine-tuning large-scale speech recognition models for domain-specific applications: A case study on Whisper model and KsponSpeech dataset

Jungwon Chang;Hosung Nam
- Phonetics and Speech Sciences (말소리와 음성과학) / Vol. 15, No. 3 / pp. 83-88 / 2023
This study investigates the fine-tuning of large-scale Automatic Speech Recognition (ASR) models, specifically OpenAI's Whisper model, for domain-specific applications using the KsponSpeech dataset. The primary research questions address the effectiveness of targeted lexical item emphasis during fine-tuning, its impact on domain-specific performance, and whether the fine-tuned model can maintain generalization capabilities across different languages and environments. Experiments were conducted using two fine-tuning datasets: Set A, a small subset emphasizing specific lexical items, and Set B, consisting of the entire KsponSpeech dataset. Results showed that fine-tuning with targeted lexical items increased recognition accuracy and improved domain-specific performance, with generalization capabilities maintained when fine-tuned with a smaller dataset. For noisier environments, a trade-off between specificity and generalization capabilities was observed. This study highlights the potential of fine-tuning using minimal domain-specific data to achieve satisfactory results, emphasizing the importance of balancing specialization and generalization for ASR models. Future research could explore different fine-tuning strategies and novel technologies such as prompting to further enhance large-scale ASR models' domain-specific performance.
https://doi.org/10.13064/KSSS.2023.15.3.083

Fine-tuning Neural Network for Improving Video Classification Performance Using Vision Transformer

이광엽;이지원;박태룡
- Journal of IKEEE (전기전자학회논문지) / Vol. 27, No. 3 / pp. 313-318 / 2023
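This paper proposes a neural network that applies fine-tuning to improve the performance of video classification based on the Vision Transformer. The need for deep-learning-based real-time video analysis has recently grown. Existing CNN models used for image classification have the drawback that it is difficult to analyze the correlation between consecutive frames. To address this problem, we compare and analyze the Vision Transformer, which uses an attention mechanism, and a Non-local neural network model to find the best model. In addition, we apply various fine-tuning methods as a transfer learning approach and propose the best fine-tuned neural network model. In the experiments, the model was trained on the UCF101 dataset, and transfer learning was then applied to the UTA-RLDD dataset to verify the model's performance.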
https://doi.org/10.7471/ikeee.2023.27.3.313

A Study on Efficiency Improvement by Fine Tuning of Power Plant Control

김호열;김병철;변승현
- The Transactions of the Korean Institute of Electrical Engineers (전기학회논문지) / Vol. 61, No. 10 / pp. 1496-1501 / 2012
Fine tuning of a control system is essential not only for stable operation but also for efficient operation of a power plant. There have been very few studies on how control system tuning changes efficiency, so it was not clear whether efficiency could be improved once the control is stabilized by fine tuning, or by how much. An accurate algorithm for measuring plant efficiency was newly introduced and implemented; it measures integrated fuel flow and electrical output (MW) and calculates the mean efficiency over a given period. As a result, stable operation after fine tuning of the control parameters for the major controlled variables yielded higher efficiency than unstable operation such as cycling or oscillation. Plant efficiency was monitored during various tests and tuning runs to confirm how much it changes with tuning of the power plant control system. We conclude that efficiency can be improved in stable operation by fine tuning of the control system.
https://doi.org/10.5370/KIEE.2012.61.10.1496

Comparing the performance of Supervised Fine-tuning, Reinforcement Learning, and Chain-of-Hindsight with Llama and OPT models

이현민;나승훈;임준호;김태형;류휘정;장두성
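- KIISE Special Interest Group on Language Engineering: Conference Proceedings, Hangul and Korean Information Processing (한국정보과학회 언어공학연구회:학술대회논문집(한글 및 한국어 정보처리)) / 35th Annual Conference on Hangul and Korean Information Processing, 2023 / pp. 217-221 / 2023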
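In recent years, advances in Large Language Models (LLMs) have driven major leaps in artificial intelligence research. These models show excellent performance on complex natural language processing tasks. In particular, language models that apply Supervised Fine Tuning, Reinforcement Learning, and Chain-of-Hindsight for human alignment have attracted attention. In this paper, we apply these three instruction-learning methods, Supervised Fine Tuning, Reinforcement Learning, and Chain-of-Hindsight, to the Llama and OPT models and measure and compare their performance.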

Fine-tuning BERT Models for Keyphrase Extraction in Scientific Articles

Lim, Yeonsoo;Seo, Deokjin;Jung, Yuchul
- Journal of Advanced Information Technology and Convergence (한국정보기술학회 영문논문지) / Vol. 10, No. 1 / pp. 45-56 / 2020
Despite extensive research, performance enhancement of keyphrase (KP) extraction remains a challenging problem in modern informatics. Recently, deep learning-based supervised approaches have exhibited state-of-the-art accuracies with respect to this problem, and several of the previously proposed methods utilize Bidirectional Encoder Representations from Transformers (BERT)-based language models. However, few studies have investigated the effective application of BERT-based fine-tuning techniques to the problem of KP extraction. In this paper, we consider the aforementioned problem in the context of scientific articles by investigating the fine-tuning characteristics of two distinct BERT models - BERT (i.e., base BERT model by Google) and SciBERT (i.e., a BERT model trained on scientific text). Three different datasets (WWW, KDD, and Inspec) comprising data obtained from the computer science domain are used to compare the results obtained by fine-tuning BERT and SciBERT in terms of KP extraction.
https://doi.org/10.14801/JAITC.2020.10.1.45

A Study of Fine Tuning Pre-Trained Korean BERT for Question Answering Performance Development

이치훈;이연지;이동희
- Journal of Information Technology Services (한국IT서비스학회지) / Vol. 19, No. 5 / pp. 83-91 / 2020
Language models such as BERT have been an important factor in deep learning-based natural language processing. Pre-training transformer-based language models is computationally expensive because they consist of deep and broad architectures with layers that use an attention mechanism, and they require huge amounts of training data. Hence, it has become necessary to fine-tune large pre-trained language models that were trained by Google or other companies able to afford the resources and cost. There are various techniques for fine-tuning language models, and this paper examines three of them: data augmentation, hyperparameter tuning, and partial reconstruction of the neural network. For data augmentation, we use no-answer augmentation and back-translation. We also identify some useful combinations of hyperparameters through a number of experiments. Finally, we add GRU and LSTM networks to the pre-trained BERT model to boost performance. We fine-tune the pre-trained Korean language model using the methods above and raise the F1 score from the baseline to 89.66. Moreover, some failed attempts offer important lessons and point to directions for further work.
https://doi.org/10.9716/KITS.2020.19.5.083

Fine-tuning of Attention-based BART Model for Text Summarization

안영필;박현준
- Journal of the Korea Institute of Information and Communication Engineering (한국정보통신학회논문지) / Vol. 26, No. 12 / pp. 1769-1776 / 2022
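Automatically summarizing text made up of long sentences is an important technology. The BART model is one of the widely used models that performs well on such summarization problems. In general, to build a summarization model for a specific domain, a language model trained on a large dataset is retrained for that domain in a process called fine-tuning. Such fine-tuning is generally done by changing the number of nodes in the final fully connected layer. In this paper, however, we propose fine-tuning by adding attention layers, which have recently been applied to various models with good results. To evaluate the proposed method, we conducted various experiments during fine-tuning, such as stacking layers more deeply and fine-tuning without skip connections. The best performance was obtained when two attention layers with skip connections were added to the BART language model.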
https://doi.org/10.6109/jkiice.2022.26.12.1769

Privacy-Preserving Language Model Fine-Tuning Using Offsite Tuning

정진명;김남규
- Journal of Intelligence and Information Systems (지능정보연구) / Vol. 29, No. 4 / pp. 165-184 / 2023
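Recently, deep learning analysis of unstructured text data using language models such as Google's BERT and OpenAI's GPT has shown remarkable results in a variety of applications. Most language models learn general-purpose linguistic information from pre-training data and are then updated for a downstream task through fine-tuning. However, concerns have recently been raised that privacy may be violated when such language models are used. Specifically, data privacy may be violated when the data owner provides a large amount of data to the model owner in order to fine-tune the language model; conversely, if the model owner discloses the entire model to the data owner, the model's structure and weights are exposed and the model's privacy may be violated. The concept of offsite tuning was recently proposed to fine-tune language models while protecting privacy in this situation, but that work has the limitation that it did not present a concrete way to apply the proposed methodology to text classification models. In this study, we therefore present a concrete method for applying offsite tuning with an added classifier to protect the privacy of both the model and the data when performing multi-class classification fine-tuning on Korean documents. To evaluate the proposed methodology, we conducted experiments on about 200,000 Korean documents provided by AIHub, covering five major fields (ICT, electrical, electronics, mechanical, and medicine), and confirmed that the proposed plug-in model outperforms the zero-shot model and the offsite model in classification accuracy.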
https://doi.org/10.13088/jiis.2023.29.4.165

All Digital DLL with Three Phase Tuning Stages

박철우;강진구
- Journal of IKEEE (전기전자학회논문지) / Vol. 6, No. 1 / pp. 21-29 / 2002
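This paper proposes a high-resolution DLL (Delay Locked Loop) built entirely from digital circuits. The proposed circuit consists of a phase detector, a delay selection block, and three stages of Coarse, Fine, and Ultra Fine phase tuning blocks, each with its own delay chain. The first stage is the Ultra Fine phase tuning block, which uses a Vernier Delay Line to obtain high resolution. The second and third stages are the Coarse and Fine phase tuning blocks, which control the phase with the resolution of the unit delay element that makes up each delay chain; the two stages have very similar structures. The circuit was simulated with HSPICE in a $0.35\,\mu\mathrm{m}$ CMOS process with a 3.3 V supply voltage. In simulation, the resolution of the circuit was improved to about 10 ps, and the operating range is 250 MHz to 800 MHz.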

Wide-Band Fine-Resolution DCO with an Active Inductor and Three-Step Coarse Tuning Loop

Pu, Young-Gun;Park, An-Soo;Park, Joon-Sung;Moon, Yeon-Kug;Kim, Su-Ki;Lee, Kang-Yoon
- ETRI Journal / Vol. 33, No. 2 / pp. 201-209 / 2011
This paper presents a wide-band fine-resolution digitally controlled oscillator (DCO) with an active inductor using an automatic three-step coarse and gain tuning loop. To control the frequency of the DCO, the transconductance of the active inductor is tuned digitally. To cover the wide tuning range, a three-step coarse tuning scheme is used. In addition, the DCO gain is calibrated digitally to compensate for gain variations. The DCO tuning range is 58% at 2.4 GHz, and the power consumption is 6.6 mW from a 1.2 V supply voltage. The effective frequency resolution is 0.14 kHz. The phase noise of the DCO output at 2.4 GHz is -120.67 dBc/Hz at 1 MHz offset.
https://doi.org/10.4218/etrij.11.0110.0209

