Integrating flexibility and fuzziness into a question driven query model

Abdullah Sarhan, Jon Rokne, Reda Alhajj
2018 Information Sciences  
Data plays an important role in daily life, so data collection, storage, maintenance and processing continue to attract considerable attention. Data exists in various formats, ranging from unstructured to structured as the two extremes. Over time, researchers and practitioners have developed various data models, which form the foundation of existing database management systems. The relational data model still dominates despite the rapid development in the techniques used for data collection, storage and processing. A relational database management system supports the Structured Query Language (SQL) for data processing, and accessing or retrieving data from a relational database directly requires knowing how to use SQL. However, the wide usage of relational databases has motivated researchers to develop more user-friendly interfaces that allow a larger population of users to access them. Such interfaces range from visual to natural language based. This thesis contributes a question-driven query model that falls under the natural language based category. The goal is to make databases reachable by a larger population, especially now that the Internet has increased database availability. The proposed model supports fuzziness: every user is free to define their own understanding of fuzzy terms. The developed system records each user's fuzzy understanding and uses it when deciding on the result to return as the answer to the question raised. Data mining techniques are employed to guide users in defining their fuzzy understanding. The model is intended to help users retrieve the data they want from a relational database without expecting them to know SQL. In the current version, only questions written in English are allowed. The system handles different types of questions: (1) simple questions, (2) complex questions with inner joins and where conditions, (3) questions that involve aggregate functions (e.g., min, max), and (4) questions with fuzzy terms. The reported test results demonstrate the effectiveness of the developed system in handling various types of questions raised by a heterogeneous set of users, ranging from professionals to naive users.
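The abstract does not say how per-user fuzzy terms are represented or turned into queries, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes a fuzzy term is a trapezoidal membership function stored per user, and that a question such as "Who is young?" becomes a parameterized SQL range obtained from an alpha-cut. The names `Trapezoid`, `PROFILES`, `fuzzy_where`, `ask_young`, the `people` table and the 0.5 threshold are all hypothetical.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Trapezoid:
    """Hypothetical trapezoidal fuzzy set: full membership on [b, c], zero outside (a, d)."""
    a: float
    b: float
    c: float
    d: float

    def membership(self, x: float) -> float:
        if self.b <= x <= self.c:
            return 1.0
        if x <= self.a or x >= self.d:
            return 0.0
        if x < self.b:
            return (x - self.a) / (self.b - self.a)  # rising edge
        return (self.d - x) / (self.d - self.c)      # falling edge

    def alpha_cut(self, alpha: float) -> tuple[float, float]:
        """Interval of values whose membership is at least alpha."""
        return (self.a + alpha * (self.b - self.a),
                self.d - alpha * (self.d - self.c))


# Assumed per-user definitions of the same fuzzy term "young".
PROFILES = {
    "alice": {"young": Trapezoid(0, 0, 25, 35)},
    "bob": {"young": Trapezoid(0, 0, 18, 30)},
}


def fuzzy_where(user: str, term: str, column: str, alpha: float = 0.5):
    """Build a WHERE fragment for `term` as this user defines it.

    `column` is assumed to be a trusted identifier taken from the schema,
    never raw user input; the bounds are passed as bound parameters.
    """
    low, high = PROFILES[user][term].alpha_cut(alpha)
    return f"{column} BETWEEN ? AND ?", (low, high)


def ask_young(conn: sqlite3.Connection, user: str) -> list[str]:
    """Answer the question 'Who is young?' using the asking user's definition."""
    clause, params = fuzzy_where(user, "young", "age")
    rows = conn.execute(
        f"SELECT name FROM people WHERE {clause} ORDER BY name", params
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("Ann", 22), ("Ben", 28), ("Cy", 33)],
)

print(ask_young(conn, "alice"))  # ['Ann', 'Ben']
print(ask_young(conn, "bob"))    # ['Ann']
```

With these assumed profiles, the same question returns different answers for the two users, which is the behavior the abstract describes: the result depends on each user's own definition of the fuzzy term.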
doi:10.1016/j.ins.2017.11.049 fatcat:gh5f2eyppfbdfd6xzg32v7hmj4