Using Topic Analysis for Querying Halal Information on Malay Documents

Haslizatul Mohamed Hanum, Zainab Abu Bakar, Nurazzah Abdul Rahman, Marshima Mohd Rosli, Norzilah Musa
2014 Procedia - Social and Behavioral Sciences  
Many documents describing halal products are available on Internet web pages. Users may enquire about halal-related information using query words and are then presented with a list of documents relevant to the query. We investigate topic analysis techniques such as Latent Semantic Analysis (LSA). For retrieval purposes, frequency-based inverted indexing and latent semantic indexing (LSI) techniques are used to discover the important ... iation of the relationship between terms and terms, terms and documents, and documents and documents. Cosine similarity is used to measure the similarity between the query word and the terms as well as the documents. We develop a prototype and evaluate the techniques on a Malay test collection containing documents extracted from a translated Al-Quran collection, a translated hadith collection, and web pages written in Malay. The results and analysis show that the LSI technique outperformed the exact frequency-based technique despite the longer processing time it required during indexing. We compare and discuss the results obtained with the latent semantic approach against those obtained with conventional frequency analysis.
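The contrast between exact frequency matching and LSI can be illustrated with a minimal sketch. It is not the authors' prototype: the five-document Malay corpus, the function names, and the choice of k = 2 latent dimensions are all assumptions made for illustration. To keep the example dependency-free, the top singular vectors are approximated by power iteration with deflation, which stands in for a library SVD routine.

```python
import math

# Hypothetical toy corpus (not the paper's test collection).
docs = [
    "halal makanan",        # d0
    "makanan sijil",        # d1: on topic, but never mentions "halal"
    "halal sijil makanan",  # d2
    "cuaca hujan",          # d3: unrelated weather topic
    "hujan cuaca hujan",    # d4
]

vocab = sorted({w for d in docs for w in d.split()})
# Term-document matrix of raw term frequencies: A[i][j] = tf of vocab[i] in docs[j].
A = [[d.split().count(t) for d in docs] for t in vocab]


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def cosine(x, y):
    nx, ny = math.sqrt(dot(x, x)), math.sqrt(dot(y, y))
    return dot(x, y) / (nx * ny) if nx and ny else 0.0


def doc_vector(j):
    """Column j of A: the term-frequency vector of document j."""
    return [row[j] for row in A]


def query_vector(query):
    words = query.split()
    return [words.count(t) for t in vocab]


def top_left_singular_vectors(k, iters=500):
    """Approximate the top-k left singular vectors of A by power
    iteration with deflation on A A^T (assumed stand-in for an SVD)."""
    m = len(A)
    M = [[dot(A[i], A[j]) for j in range(m)] for i in range(m)]
    vectors = []
    for _ in range(k):
        u = [1.0] * m
        for _ in range(iters):
            u = [dot(row, u) for row in M]
            norm = math.sqrt(dot(u, u))
            u = [x / norm for x in u]
        lam = dot(u, [dot(row, u) for row in M])  # eigenvalue = sigma^2
        vectors.append(u)
        # Remove the found component so the next pass finds the next vector.
        M = [[M[i][j] - lam * u[i] * u[j] for j in range(m)] for i in range(m)]
    return vectors


def frequency_scores(query):
    """Cosine similarity in the raw term space (exact term matching)."""
    q = query_vector(query)
    return [cosine(q, doc_vector(j)) for j in range(len(docs))]


def lsi_scores(query, k=2):
    """Cosine similarity after projecting query and documents onto
    the k-dimensional latent space spanned by the top singular vectors."""
    U = top_left_singular_vectors(k)

    def project(v):
        return [dot(u, v) for u in U]

    q = project(query_vector(query))
    return [cosine(q, project(doc_vector(j))) for j in range(len(docs))]


freq = frequency_scores("halal")
lsi = lsi_scores("halal")
for j, text in enumerate(docs):
    print(f"d{j} {text!r}: frequency={freq[j]:.3f} lsi={lsi[j]:.3f}")
```

In this toy setting, document d1 shares no term with the query "halal", so exact matching scores it at zero, while the latent projection places it on the same topic as d0 and d2. The extra decomposition step is also where LSI spends additional indexing time compared with plain frequency counting.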
doi:10.1016/j.sbspro.2014.01.1122 fatcat:mk43be6rynemtpkmbqku2efxam