TOWARDS THE DEVELOPMENT OF A CLASSIFICATION MODEL FOR TECHNICAL DOCUMENTS IN KNOWLEDGE DISCOVERY SYSTEMS

2020 Issues in Information Systems  
An important component of knowledge management is the organization of documents for quick and easy access. One effective way of organizing these documents is to group them by a fixed set of specific knowledge categories. For large-scale technical teams, the number of categories can reach thousands or even tens of thousands, which makes this type of cataloging especially useful. Text classification is a sophisticated process that involves data preprocessing, transformation, dimensionality reduction, application of classification techniques, classifier evaluation, and classifier validation. This paper describes the preliminary results from phase one of a design-science research study for the development of a model that can be used for classification of financial software development documentation in knowledge discovery systems using machine learning. Specifically, testing with a small dataset of 64 documents from the Natural Language Technical Project Documentation dataset was conducted to assess the effectiveness of traditional text classification methods. Results indicate limitations of these traditional methods for classifying technical documents. The next steps include evaluating performance on a larger dataset of technical documents and testing new deep learning techniques.
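The stages listed in the abstract (preprocessing, transformation, dimensionality reduction, classification, evaluation) can be illustrated with a minimal sketch. The abstract does not say which traditional methods were tested, so everything here is an assumption made for illustration: the multinomial Naive Bayes classifier, the document-frequency cutoff used as a crude form of dimensionality reduction, the stopword list, the two category names, and the toy documents are all hypothetical and are not taken from the study or its dataset. Classifier validation is not shown.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical stopword list, chosen only for this illustration.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "for", "in", "is", "on"}


def preprocess(text):
    """Preprocessing: lowercase, keep alphabetic tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def select_vocabulary(token_lists, min_df=2):
    """Crude dimensionality reduction: keep terms found in >= min_df documents."""
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    return {term for term, n in doc_freq.items() if n >= min_df}


def vectorize(tokens, vocabulary):
    """Transformation: bag-of-words counts restricted to the vocabulary."""
    return Counter(t for t in tokens if t in vocabulary)


def train_naive_bayes(vectors, labels, vocabulary):
    """Classification: multinomial Naive Bayes with add-one smoothing."""
    docs_per_class = Counter(labels)
    term_counts = defaultdict(Counter)
    for vec, label in zip(vectors, labels):
        term_counts[label].update(vec)
    model = {}
    for label, n_docs in docs_per_class.items():
        total = sum(term_counts[label].values()) + len(vocabulary)
        log_prior = math.log(n_docs / len(labels))
        log_probs = {
            t: math.log((term_counts[label][t] + 1) / total) for t in vocabulary
        }
        model[label] = (log_prior, log_probs)
    return model


def predict(model, vec):
    """Return the class with the highest log-posterior score."""
    def score(label):
        log_prior, log_probs = model[label]
        return log_prior + sum(n * log_probs[t] for t, n in vec.items())
    return max(model, key=score)


def accuracy(model, vectors, labels):
    """Evaluation: fraction of documents assigned their true label."""
    hits = sum(predict(model, v) == y for v, y in zip(vectors, labels))
    return hits / len(labels)


# Hypothetical toy documents and categories (not from the study's dataset).
train_docs = [
    ("The API endpoint returns a JSON response for each request", "api"),
    ("Authenticate every API request with a token in the header", "api"),
    ("The endpoint validates the request header and token", "api"),
    ("Create an index on the table to speed up the query", "database"),
    ("The query joins the orders table with the customer table", "database"),
    ("Add a database index before running the migration query", "database"),
]
test_docs = [
    ("Send a request to the endpoint with a valid token", "api"),
    ("Run the query against the indexed table", "database"),
]

train_tokens = [preprocess(text) for text, _ in train_docs]
train_labels = [label for _, label in train_docs]
vocabulary = select_vocabulary(train_tokens, min_df=2)
train_vectors = [vectorize(tokens, vocabulary) for tokens in train_tokens]
model = train_naive_bayes(train_vectors, train_labels, vocabulary)

test_vectors = [vectorize(preprocess(text), vocabulary) for text, _ in test_docs]
test_labels = [label for _, label in test_docs]
print("held-out accuracy:", accuracy(model, test_vectors, test_labels))
```

With only a handful of documents per category, the document-frequency cutoff discards most of the vocabulary, which hints at why a small corpus with many categories is difficult for count-based methods of this kind.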
doi:10.48009/4_iis_2020_67-72 fatcat:nmgzj3hftzheplda6l7vb4bjga