• Title/Abstract/Keywords: data field selection

404 search results (processing time: 0.027 seconds)

Feature Selection Methodology in Quality Data Mining

  • Soo, Nam-Ho;Halim, Yulius
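    • Proceedings of the Korean Operations Research and Management Science Society Conference (한국경영과학회:학술대회논문집) / Proceedings of the 2004 Spring Joint Conference of the Korean Institute of Industrial Engineers and the Korean Operations Research and Management Science Society / pp.698-701 / 2004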
  • In much of the literature, data mining is treated as a way of exploiting data warehouses and data collections. Its largest applications are in marketing and research, mainly because the data available in these fields is usually plentiful. Data mining can also be extended to the production process. Whereas data mining in marketing studies customers and products, data mining in production studies the so-called 4M1E: man, machine, materials, method (recipe), and environment. All of these elements matter to the production process, which determines the quality of the product. Because the final aim of data mining in the production field is production quality, it is commonly known as quality data mining. Since the variables studied in quality data mining can number in the hundreds or more, extracting information from the data warehouse can take a long time. A feature selection methodology is proposed to help reach the best performance in a relatively short time. Its use of simple, readily available statistical tools can help speed up the mining.

사례 선택 기법을 활용한 앙상블 모형의 성능 개선 (Improving an Ensemble Model Using Instance Selection Method)

  • 민성환
    • Journal of the Society of Korea Industrial and Systems Engineering (산업경영시스템학회지) / Vol. 39, No. 1 / pp.105-115 / 2016
  • Ensemble classification combines individually trained classifiers to yield more accurate predictions than individual models. Ensemble techniques are very useful for improving the generalization ability of classifiers. The random subspace ensemble technique is a simple but effective method for constructing ensemble classifiers; it involves randomly drawing a subset of the features for each classifier in the ensemble. The instance selection technique involves selecting critical instances while removing irrelevant and noisy instances from the original dataset. The instance selection and random subspace methods are both well known in the field of data mining and have proven to be very effective in many applications. However, few studies have focused on integrating the two. Therefore, this study proposed a new hybrid ensemble model that integrates instance selection and random subspace techniques using genetic algorithms (GAs) to improve the performance of a random subspace ensemble model. GAs are used to select optimal (or near-optimal) instances, which are used as input data for the random subspace ensemble model. The proposed model was applied to both Kaggle credit data and corporate credit data, and the results were compared with those of other models in terms of classification accuracy, levels of diversity, and average classification rates of the base classifiers in the ensemble. The experimental results demonstrated that the proposed model outperformed the other models, including the single model, the instance selection model, and the original random subspace ensemble model.
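The random subspace idea described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' model: the GA-based instance selection step is omitted, the nearest-centroid base classifier is a stand-in chosen only to keep the example self-contained, and all function names, parameters, and the toy data are hypothetical.

```python
import random
from collections import Counter, defaultdict

def fit_centroids(X, y, features):
    """Nearest-centroid base classifier restricted to the given feature indices."""
    sums = defaultdict(lambda: [0.0] * len(features))
    counts = Counter(y)
    for row, label in zip(X, y):
        for k, f in enumerate(features):
            sums[label][k] += row[f]
    return {label: [s / counts[label] for s in sums[label]] for label in sums}

def predict_one(centroids, features, row):
    """Return the label whose centroid is closest to the row in the subspace."""
    def dist(c):
        return sum((row[f] - c[k]) ** 2 for k, f in enumerate(features))
    return min(centroids, key=lambda label: dist(centroids[label]))

def random_subspace_fit(X, y, n_estimators=10, subspace_size=2, seed=0):
    """Train each ensemble member on its own randomly drawn feature subset."""
    rng = random.Random(seed)
    n_features = len(X[0])
    ensemble = []
    for _ in range(n_estimators):
        features = sorted(rng.sample(range(n_features), subspace_size))
        ensemble.append((features, fit_centroids(X, y, features)))
    return ensemble

def random_subspace_predict(ensemble, row):
    """Combine the members' predictions by majority vote."""
    votes = Counter(predict_one(c, f, row) for f, c in ensemble)
    return votes.most_common(1)[0][0]

# Toy data (hypothetical): two well-separated classes with four features.
X = [[0, 0, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1],
     [10, 10, 10, 10], [9, 10, 9, 10], [10, 9, 10, 9]]
y = ["a", "a", "a", "b", "b", "b"]
ensemble = random_subspace_fit(X, y)
```

In a hybrid model of the kind the abstract describes, the rows of `X` passed to the fitting step would first be filtered by the instance selection procedure; that step is left out here.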

공동주택과 오피스 건설현장의 조직원선정 및 평가실태 비교분석 (A Comparative Analysis of Status on the Selection and Evaluation of Field Manager in Apartment and Office Building Project)

  • 이정용;김병래;손창백
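    • Proceedings of the Korea Institute of Construction Engineering and Management Conference (한국건설관리학회:학술대회논문집) / Proceedings of the 2003 Conference of the Korea Institute of Construction Engineering and Management / pp.290-293 / 2003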
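  • Organizational management in the construction industry is an important factor in productivity and cost reduction. However, efforts to improve organizations are currently lacking, and systematic methods for evaluating them are not in place. Through an analysis of how field staff are selected for apartment and office building sites, this study compiled and presented the appointment methods, the criteria for determining ranks and headcount, and the qualification requirements for field staff, and proposed a selection procedure for them. In addition, through an analysis of how the activities of field management staff are evaluated, it proposed evaluation items and point allocations for site managers and for completed sites. The results are intended to contribute to productivity improvement and cost reduction by providing basic data for efficient organizational management to construction companies that lack systematic criteria for selecting and evaluating field management staff.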

기존 물류 네트워크 기반에서 크로스 - 도킹 거점선정에 관한 연구 (A Study on Selection of Cross-Docking Center based on Existing Logistics Network)

  • 이인철;이명호;김내헌
    • Industrial Engineering (산업공학) / Vol. 19, No. 1 / pp.26-33 / 2006
  • Many firms consider applying a cross-docking system to reduce inventory and lead time. However, most studies concentrate on the design of a cross-docking system. This study presents a method for selecting the cross-docking center within an existing logistics network. By describing the operating environment in which the cross-docking system is applied, the selection criteria for the cross-docking center, and the main constraints on transportation planning in a multi-level logistics network, we define the cross-docking center selection problem as applied to the logistics field. We also define a simulation model that can analyze the cross-docking volume in various ways, and we develop a selection methodology for the cross-docking center. The simulation model specifies the algorithm and influencing factors of the cross-docking system, the decision criteria of the system, the policy parameters, and the input data. In addition, this study analyzes the effect of increasing the number of simultaneous receiving and shipping docks, and the efficiency of overnight transportation and cross-docking, by simulating and evaluating each scenario with practical data from the logistics field.

Improving Field Crop Classification Accuracy Using GLCM and SVM with UAV-Acquired Images

  • Seung-Hwan Go;Jong-Hwa Park
    • Korean Journal of Remote Sensing (대한원격탐사학회지) / Vol. 40, No. 1 / pp.93-101 / 2024
  • Accurate field crop classification is essential for various agricultural applications, yet existing methods face challenges due to diverse crop types and complex field conditions. This study aimed to address these issues by combining support vector machine (SVM) models with multi-seasonal unmanned aerial vehicle (UAV) images, texture information extracted from the Gray Level Co-occurrence Matrix (GLCM), and RGB spectral data. Twelve high-resolution UAV image captures spanned March-October 2021, while field surveys on three dates provided ground truth data. We focused on data from August (-A), September (-S), and October (-O) images and trained four support vector classifier (SVC) models (SVC-A, SVC-S, SVC-O, SVC-AS) using visual bands and eight GLCM features. Farm maps provided by the Ministry of Agriculture, Food and Rural Affairs proved efficient for open-field crop identification and served as a reference for accuracy comparison. Our analysis showed the significant impact of hyperparameter tuning (C and gamma) on SVM model performance, requiring careful optimization for each scenario. Importantly, we identified models exhibiting distinct high-accuracy zones, with SVC-O trained on October data achieving the highest overall and individual crop classification accuracy. This success likely stems from its ability to capture distinct texture information from mature crops. Incorporating GLCM features proved highly effective for all models, significantly boosting classification accuracy. Among these features, homogeneity, entropy, and correlation consistently demonstrated the most impactful contribution. However, balancing accuracy with computational efficiency and feature selection remains crucial for practical application. Performance analysis revealed that SVC-O achieved exceptional results in overall and individual crop classification, while soybeans and rice were consistently classified well by all models. Challenges were encountered with cabbage due to its early growth stage and low field cover density. The study demonstrates the potential of utilizing farm maps and GLCM features in conjunction with SVM models for accurate field crop classification. Careful parameter tuning and model selection based on specific scenarios are key for optimizing performance in real-world applications.
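To make the GLCM texture features named in this abstract concrete, the following Python sketch computes a normalized co-occurrence matrix and two of the features (homogeneity and entropy) for a tiny hypothetical image. It is an illustration under stated assumptions, not the authors' pipeline: the pixel offset, the symmetric counting, the homogeneity form 1 / (1 + (i - j)^2), and the base-2 logarithm are all choices made here, and the function names are hypothetical.

```python
import math
from collections import Counter

def glcm(image, dx=1, dy=0, symmetric=True):
    """Normalized gray-level co-occurrence matrix as {(i, j): probability}."""
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = image[r][c], image[r2][c2]
                counts[(a, b)] += 1
                if symmetric:
                    # Count the pair in both directions.
                    counts[(b, a)] += 1
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}

def homogeneity(p):
    """Weights co-occurrences near the diagonal (similar gray levels) highest."""
    return sum(v / (1 + (i - j) ** 2) for (i, j), v in p.items())

def entropy(p):
    """Shannon entropy of the co-occurrence distribution, in bits."""
    return -sum(v * math.log2(v) for v in p.values() if v > 0)

# Hypothetical 4x4 image with gray levels 0..3.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(image)      # horizontal neighbor at distance 1
flat = glcm([[5, 5], [5, 5]])  # a perfectly uniform patch
```

A uniform patch such as `flat` gives the maximum homogeneity and zero entropy, while a more varied patch lowers the former and raises the latter. In a classification setting, such per-window values would be stacked with the spectral bands as input features.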

Habitat selection in the lesser cuckoo, an avian brood parasite breeding on Jeju Island, Korea

  • Yun, Seongho;Lee, Jin-Won;Yoo, Jeong-Chil
    • Journal of Ecology and Environment
    • / Vol. 44, No. 2 / pp.106-114 / 2020
  • Background: Determining patterns of habitat use is key to understanding animal ecology. Approximately 1% of bird species use brood parasitism as their breeding strategy, in which they exploit the parental care of other species (hosts) by laying eggs in their nests. Brood parasitism may complicate the habitat requirements of brood parasites because they need habitats that support both their hosts and their own conditions for breeding. Brood parasitism, through changes in the reproductive roles of sexes or individuals, may further diversify habitat use patterns among individuals. However, patterns of habitat use in avian brood parasites have rarely been characterized. In this study, we categorized the habitat preference of a population of brood parasitic lesser cuckoos (Cuculus poliocephalus) breeding on Jeju Island, Korea. By using compositional analyses together with radio-tracking and land cover data, we determined patterns of habitat use and their sexual and diurnal differences. Results: We found that the lesser cuckoo had a relatively large home range and that its overall habitat composition (second-order selection) was similar to that of the study area; open areas such as field and grassland habitats accounted for 80% of the home range. Nonetheless, their habitat, comprising 2.54 different habitats per hectare, could be characterized as a mosaic. We also found sexual differences in habitat composition and selection in the core-use area of home ranges (third-order selection). In particular, the forest habitat was preferentially utilized by females but underutilized by males. However, there was no diurnal change in the pattern of habitat use. Both sexes preferred field habitats at the second-order selection. At the third-order selection, males preferred field habitats followed by grasslands, and females preferred grasslands followed by forest habitats. Conclusions: We suggest that the field and grassland habitats represent the two most important areas for the lesser cuckoo on Jeju Island. Nevertheless, this study shows that habitat preference may differ between sexes, likely due to differences in sex roles, sex-based energy demands, and potential sexual conflict.

시공 단계를 고려한 터널의 역해석에 관한 연구 (Back Analysis of Tunnel for multi-step Construction)

  • 김선명;윤지선
    • Proceedings of the Korean Geotechnical Society Conference (한국지반공학회:학술대회논문집) / Proceedings of the 2000 Fall Conference of the Korean Geotechnical Society / pp.479-484 / 2000
  • Reliable estimation of the system parameters and accurate prediction of the system behavior are important for designing tunnels safely and economically. Back analysis using field measurement data is therefore useful for evaluating the geotechnical parameters of a tunnel. In the back analysis method, the selection of initial values and the uncertainty of the field measurements significantly influence the analysis result. In this paper, to overcome the uncertainty of the field measurements, we performed the back analysis using the displacement data obtained at each step of excavation and support.

Efficient crosswell EM Tomography using localized nonlinear approximation

  • Kim Hee Joon;Song Yoonho;Lee Ki Ha;Wilt Michael J.
    • Geophysics and Geophysical Exploration (지구물리와물리탐사) / Vol. 7, No. 1 / pp.51-55 / 2004
  • This paper presents a fast and stable imaging scheme that uses the localized nonlinear (LN) approximation of integral equation (IE) solutions to invert electromagnetic data obtained in a crosswell survey. The medium is assumed to be cylindrically symmetric about a source borehole, and to maintain the symmetry a vertical magnetic dipole is used as the source. To find an optimum balance between data fitting and the smoothness constraint, we introduce an automatic selection scheme for the Lagrange multiplier, which is sought at each iteration with a least-misfit criterion. In this selection scheme, the IE algorithm is attractive for saving computing time because Green's functions, whose calculation is the most time-consuming part of IE methods, can be reused throughout the inversion process. The inversion scheme using the LN approximation has been tested on both synthetic and field data to show its stability and efficiency. The image inverted from the field data, collected in a pilot experiment of water-flood monitoring in an oil field, compares well with that derived by a 2.5-dimensional inversion scheme.

유전자 발현 데이터를 이용한 암의 유형 분류 기법 (Cancer-Subtype Classification Based on Gene Expression Data)

  • 조지훈;이동권;이민영;이인범
    • Journal of Institute of Control, Robotics and Systems (제어로봇시스템학회논문지) / Vol. 10, No. 12 / pp.1172-1180 / 2004
  • Recently, gene expression data, a product of high-throughput technology, have become widely available, and the related studies (so-called bioinformatics) have come to occupy an important position in biological and medical research. The microarray is a revolutionary technology that enables us to monitor several thousand genes simultaneously and thus gain insight into phenomena in the human body (e.g., the mechanism of cancer progression) at the molecular level. To obtain useful information from such gene expression measurements, it is essential to analyze the data with appropriate techniques. However, the high dimensionality of the data can cause problems such as the curse of dimensionality and singularity in matrix computations, making it difficult to apply conventional data analysis methods. Developing methods that can handle such data effectively has therefore become a challenging issue in computational biology. This research focuses on gene selection and classification for cancer subtype discrimination based on gene expression (microarray) data.

기술 평가 및 선정을 위한 AHP와 DEA 통합 활용 방법: 청정기술에의 적용 (Integrated AHP and DEA method for technology evaluation and selection: application to clean technology)

  • Yu, Peng;Lee, Jang Hee
    • Knowledge Management Research (지식경영연구) / Vol. 13, No. 3 / pp.55-77 / 2012
  • Selecting promising technologies is becoming more and more difficult as their number and complexity increase. In this study, we propose a hybrid AHP/DEA-AR method and a hybrid AHP/DEA-AR-G method that evaluate the efficiency of technology alternatives based on ordinal rating data collected through a survey of technology experts in a given field, and that select the efficient alternatives as promising technologies. The proposed methods normalize the rating data and use AHP to derive weights to improve the credibility of the analysis; then, to avoid the problems of the basic DEA models, they use DEA-AR and DEA-AR-G to evaluate the efficiency of the technology alternatives. We applied the proposed methods to clean technology and compared them with the basic DEA models. The comparison shows that both proposed methods are excellent at identifying the most efficient technology, and that the hybrid AHP/DEA-AR method is much easier to use in the technology selection process.
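The AHP weighting step mentioned in this abstract can be sketched in Python as follows. This is only an illustration: it uses the row geometric-mean approximation rather than the principal-eigenvector computation, the abstract does not say which variant the authors used, the DEA-AR stages are not shown, and the function name and comparison matrix are hypothetical.

```python
import math

def ahp_weights(pairwise):
    """Approximate AHP priority weights by the row geometric-mean method."""
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]

# Hypothetical 3x3 pairwise comparison matrix: criterion 0 is judged twice
# as important as criterion 1 and four times as important as criterion 2.
pairwise = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = ahp_weights(pairwise)
```

For a perfectly consistent matrix like this one, the geometric-mean weights coincide with the eigenvector weights, here 4/7, 2/7, and 1/7. With inconsistent expert judgments the two methods can differ slightly.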