The MIEL system: Uniform interrogation of structured and weakly-structured imprecise data

Ollivier Haemmerlé, Patrice Buche, Rallou Thomopoulos
2007 Journal of Intelligent Information Systems  
We present an information system developed to help assess microbiological risk in food. This information system contains experimental results in microbiology, mainly extracted from scientific publications. The increasing volume of available experimental results and the difficulty of integrating them into a classic relational database schema led us to design a system composed of two distinct subsystems queried through a common interface. The first subsystem is a classic relational database. The second subsystem is a database containing weakly-structured pieces of information expressed as conceptual graphs. The data stored in both bases can be fuzzy, in order to take into account the specificities of biological information. The uniform query language used on both the relational database and the conceptual graph database allows users to express preferences by using fuzzy sets in their queries. The MIEL system is now operational and used by the microbiologists involved in the French Sym'Previus project.
Our project has to deal with several important specificities, which affect both the kind of data we have to store and the kind of processing we want to run on these data. As for the data we want to store, their first specificity is that they concern a scientific field of intense activity. It is very difficult to offer microbiologists a classic database schema that remains up to date for long, since their needs are constantly evolving. Their second characteristic is that they can be numerical (values of experimental parameters such as temperature, pH, ...) or symbolic. The symbolic information is often hierarchized (taxonomy of bacteria (Ballows et al., 1992), of food products (Ireland and Moller, 2000), ...). Finally, the data can be imprecise, since they concern complex biological processes; moreover, the measurement tools are limited by their own imprecision. As for the processing of these data, it is important to note that the end-users are not specialists in computer science. Our tool is intended to help microbiologists search for information in order to prevent contamination of food products. Another specific point is that a database of biological experiments is incomplete, as it will never cover all possible experimental conditions. The risk of empty answers to a query is therefore significant.
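To make the idea of fuzzy preferences and imprecise data more concrete, here is a minimal sketch in Python. It is not taken from the MIEL system: the `Trapezoid` class, the `possibility` function, the grid-sampling approach, and all pH values are assumptions chosen for illustration. It models a user preference and an imprecise measurement as trapezoidal fuzzy sets and estimates how far the two are compatible as the highest degree to which a single value belongs to both.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trapezoid:
    """Trapezoidal fuzzy set (hypothetical helper): support [a, d], kernel [b, c]."""
    a: float
    b: float
    c: float
    d: float

    def membership(self, x: float) -> float:
        # Inside the kernel: fully compatible.
        if self.b <= x <= self.c:
            return 1.0
        # Outside the support: not compatible at all.
        if x <= self.a or x >= self.d:
            return 0.0
        # Rising edge between a and b.
        if x < self.b:
            return (x - self.a) / (self.b - self.a)
        # Falling edge between c and d.
        return (self.d - x) / (self.d - self.c)


def possibility(pref: Trapezoid, datum: Trapezoid, steps: int = 2000) -> float:
    """Approximate sup over x of min(pref(x), datum(x)) by sampling a regular grid."""
    lo = min(pref.a, datum.a)
    hi = max(pref.d, datum.d)
    best = 0.0
    for i in range(steps + 1):
        x = lo + (hi - lo) * i / steps
        best = max(best, min(pref.membership(x), datum.membership(x)))
    return best


# Illustrative, made-up values: the user prefers a pH around 6 to 7,
# while the stored measurement is an imprecise pH around 7.4 to 7.6.
preferred_ph = Trapezoid(5.5, 6.0, 7.0, 7.5)
measured_ph = Trapezoid(7.2, 7.4, 7.6, 7.8)
degree = possibility(preferred_ph, measured_ph)
```

With a crisp condition such as "pH between 6 and 7", this measurement would be rejected outright; with the fuzzy preference it is kept with an intermediate degree, which is one way to reduce the risk of empty answers.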
All these specificities led us to design a data model with several original aspects. Our first choice was to split the data between two distinct bases. The first is a classic relational database that contains the stable part of our information. We chose a relational database for reasons of efficiency and standardization, and its schema was designed in close collaboration with microbiologists. However, as modifying the schema of such a database is quite an expensive operation, we decided to use an additional base to store information that was not anticipated when the schema was designed but nevertheless appears useful. We chose to represent this part of the data, which we call "weakly-structured", in the conceptual graph model (Sowa, 1984) for several reasons: (i) its graph structure appeared to be a flexible way of representing complementary information; (ii) its readability seemed to us an asset, since we had to work with non-specialists; (iii) its interpretation in first-order logic provided a robust theoretical framework; (iv) a development platform providing efficient algorithms was available; (v) the distinction between the terminological part and the assertional part of the knowledge (as in Description Logics, for example) was valuable for implementing query relaxation mechanisms.
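The text above does not describe how query relaxation works, so the following Python sketch shows only one plausible reading of point (v), not the MIEL mechanism. The terminological part is assumed to be a small type hierarchy (`PARENT`), the assertional part a list of records, and relaxation replaces a term that yields no answer by its parent. The hierarchy fragment, the record layout, and the function names are all hypothetical.

```python
from typing import Optional

# Terminological part (hypothetical fragment): child type -> parent type.
PARENT = {
    "Listeria monocytogenes": "Listeria",
    "Listeria innocua": "Listeria",
    "Listeria": "Bacterium",
    "Escherichia coli": "Escherichia",
    "Escherichia": "Bacterium",
}

# Assertional part (made-up records standing in for stored experimental results).
RECORDS = [
    {"bacterium": "Listeria innocua", "ph": 6.2},
    {"bacterium": "Escherichia coli", "ph": 7.0},
]


def is_a(term: Optional[str], ancestor: str) -> bool:
    """True if `term` equals `ancestor` or is one of its descendants."""
    while term is not None:
        if term == ancestor:
            return True
        term = PARENT.get(term)
    return False


def query_with_relaxation(records, wanted: str):
    """Return (term actually used, matching records).

    If the requested term yields no answer, generalize it one level
    at a time along the hierarchy until some record matches.
    """
    term: Optional[str] = wanted
    while term is not None:
        hits = [r for r in records if is_a(r["bacterium"], term)]
        if hits:
            return term, hits
        term = PARENT.get(term)
    return None, []


used_term, answers = query_with_relaxation(RECORDS, "Listeria monocytogenes")
```

Here no record mentions Listeria monocytogenes, so the query is generalized to Listeria and returns the Listeria innocua record. Keeping the hierarchy separate from the records is what allows the relaxation step to be written without touching the stored data.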
doi:10.1007/s10844-006-0014-z fatcat:34k4cwj5kfby5ep32nbnqagvxm