Barq: Distributed multilingual internet search engine with focus on Arabic language

T. Rachidi, O. Iraqi, M. Bouzoubaa, A.B. El Khattab, M. El Kourdi, A. Zahi, A. Bensaid
SMC'03 Conference Proceedings. 2003 IEEE International Conference on Systems, Man and Cybernetics. Conference Theme - System Security and Assurance (Cat. No.03CH37483)  
Barq is a distributed multilingual search engine with a focus on the Arabic language. Over a period of some two years, the Barq R&D project has involved work on Arabic language processing, Arabic word root extraction, indexing, information retrieval, automatic categorization, focused crawling, distributed computing, distributed database systems, and performance tuning. Barq indexes all documents of the web (and optionally of a particular site), including Word and XML documents, that contain at least a single word of Arabic in the CP1256, UTF-8, ISO8859_6, ASMO 449 or ASMO 708 code sets. The documents themselves can also contain Latin-based characters. This paper focuses on describing the architecture and design patterns of Barq, as well as the various types of search that Barq supports. Issues such as stemming/Arabic root extraction, indexing, ranking, precision and recall measurements, and automatic categorization are presented too, but their details are described in other works.
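The indexing criterion stated in the abstract (a document qualifies if it contains at least one Arabic word in one of the listed code sets) can be illustrated with a minimal Python sketch. This is not Barq's implementation: the function name `contains_arabic_word`, the regular expression, and the choice to rely on a declared encoding are all assumptions made for illustration. ASMO 449 is left out because the Python standard library has no codec for it.

```python
import re

# Code sets from the abstract that Python's standard library can decode.
# Hypothetical subset: ASMO 449 is omitted (no stdlib codec).
SUPPORTED_ENCODINGS = ("utf-8", "cp1256", "iso8859_6")

# A run of characters in the basic Arabic letter range U+0621..U+064A.
# Assumption: this range is a simple stand-in for "an Arabic word".
ARABIC_WORD = re.compile(r"[\u0621-\u064A]+")


def contains_arabic_word(raw: bytes, encoding: str) -> bool:
    """Return True if raw, decoded with the declared encoding,
    contains at least one Arabic word."""
    if encoding not in SUPPORTED_ENCODINGS:
        raise ValueError(f"unsupported encoding: {encoding}")
    try:
        text = raw.decode(encoding)
    except UnicodeDecodeError:
        # Bytes are not valid in the declared encoding: treat as no match.
        return False
    return ARABIC_WORD.search(text) is not None


if __name__ == "__main__":
    sample = "\u0645\u0631\u062d\u0628\u0627 web"  # Arabic word plus Latin text
    for enc in SUPPORTED_ENCODINGS:
        print(enc, contains_arabic_word(sample.encode(enc), enc))
    print("latin only", contains_arabic_word(b"plain latin text", "utf-8"))
```

The sketch takes the encoding as an input rather than guessing it, because the same bytes can decode to Arabic letters under one code set and to accented Latin text under another.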
doi:10.1109/icsmc.2003.1243853 dblp:conf/smc/RachidiIBKKZB03 fatcat:rr6j5hgydbf6fol5iqtz4re22a