Document Clustering using Learning from Examples

G. ThavasiRaja, R. Malmathanraj, M. Arun
2012 International Journal of Computer Applications  
Information filtering (IF) systems usually filter data items by correlating a set of terms representing the user's interest with similar sets of terms representing the data items. Many techniques have been employed for constructing user profiles automatically, but they usually yield large sets of data. Various dimensionality-reduction techniques can be applied to reduce the number of terms in a user query. A new framework is described to classify documents on a large scale and retrieve the documents related to the user's query by applying a trained artificial neural network (ANN) model. Its novel feature is the identification of an optimal set of documents that are relevant to the user. As a case study, government orders issued by the government of Tamil Nadu, a state in India, are classified according to their semantic similarity. Various neural architectures such as back-propagation neural network (BPN), radial basis function (RBF) network, and learning vector quantization (LVQ), along with support vector machines (SVM), are used, and their performance is evaluated.
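The term-correlation step that the abstract attributes to typical IF systems can be illustrated with a minimal sketch. This is not the paper's ANN-based framework: it assumes plain whitespace tokenization, raw term counts, and cosine similarity as the correlation measure. The function names, the sample texts, and the 0.2 threshold are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def term_vector(text):
    """Build a term-count vector from lowercased, whitespace-split text."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_documents(profile_terms, documents, threshold=0.2):
    """Return (score, document) pairs at or above the threshold, best first.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    profile = term_vector(profile_terms)
    scored = [(cosine_similarity(profile, term_vector(d)), d) for d in documents]
    return sorted([(s, d) for s, d in scored if s >= threshold], reverse=True)


if __name__ == "__main__":
    # Hypothetical user profile and document snippets.
    docs = ["land revenue order", "school education grant", "revenue tax order land"]
    for score, doc in filter_documents("land revenue", docs):
        print(f"{score:.3f}  {doc}")
```

A real system would typically add stop-word removal, term weighting, and dimensionality reduction before a trained classifier such as those named in the abstract; the sketch covers only the basic profile-to-document matching.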
doi:10.5120/4872-7299 fatcat:gsnr5anizrfy5ggulf3ugsrw7q