Image databases: What are they and what do they bring to microscopy

J.M. Carazo, P. de Alarcón, M. Chagoyen
2003 Microscopy and Microanalysis  
Databases come into play when large amounts of data must be organized in such a way that the retrieval of individual pieces of information becomes the main issue. In other words, databases are not simple "repositories" where information is merely preserved, but complex structures in which data items are organized so that they can be accessed efficiently. The field of databases is a classic one in computer science and informatics, and many paradigms on how to design these complex structures have
been put forward over the years. In this way we encounter the so-called "relational databases", "object-oriented databases", and so forth. These terms refer to the basic manner in which the information is organized, as well as to the functionality that can be embedded within these structures. A basic understanding of the principles behind these terms is essential, and they will be introduced in the tutorial through examples. Some recommended introductory readings can be found at www.cnb.uam.es/~carazo/msa2003, where a web site with links and references pertinent to the topic of this tutorial will be maintained. A crucial point is what kind of information can be properly stored. Traditionally, databases have contained text data. Typical examples are databases containing information on the employees of a firm, or those containing our credit history. Even in the Life Sciences the best-known databases contain essentially text data, even if this "text" is somewhat special and encodes the sequence of bases of a gene, the sequence of amino acids of a protein, or triplets of (x, y, z) values corresponding to atomic coordinates. Obviously, in the microscopy field there is ample room for text-based databases. They provide, for instance, ways to keep track of the experimental imaging conditions of a series of experiments, the person performing them, the project they are attached to, and so on. However, the real breakthrough comes when we also incorporate images into those databases, since they are the final microscopy information that needs to be organized. The problem arises, naturally, because images are not structured entities like text; they are essentially complex binary pieces of data. The traditional way to incorporate them is to define an entity called a "blob", from binary large object, which essentially means treating them as a whole, blindly in a way.
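The "blob" approach described above can be illustrated with a minimal sketch. It assumes Python's standard sqlite3 module and an in-memory database; the table name micrograph, its columns, the sample rows, and the helper functions are all hypothetical and are not taken from the tutorial. The point is only that the text columns can be queried by the database, while the pixel data is an opaque byte string that must be pulled out and compared in application code.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with text metadata and a BLOB column.

    The schema and the sample rows are illustrative assumptions.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE micrograph (
            id         INTEGER PRIMARY KEY,
            operator   TEXT NOT NULL,
            project    TEXT NOT NULL,
            voltage_kv REAL NOT NULL,
            pixels     BLOB NOT NULL  -- raw image bytes, opaque to SQL
        )
        """
    )
    rows = [
        ("alice", "ribosome", 200.0, bytes([0, 10, 20, 30])),
        ("bob", "ribosome", 300.0, bytes([0, 10, 20, 31])),
        ("alice", "virus", 200.0, bytes([255, 255, 0, 0])),
    ]
    conn.executemany(
        "INSERT INTO micrograph (operator, project, voltage_kv, pixels) "
        "VALUES (?, ?, ?, ?)",
        rows,
    )
    conn.commit()
    return conn


def ids_by_operator(conn, operator):
    """Text metadata can be queried directly by the database engine."""
    cur = conn.execute(
        "SELECT id FROM micrograph WHERE operator = ? ORDER BY id",
        (operator,),
    )
    return [row[0] for row in cur]


def blob_difference(conn, id_a, id_b):
    """Compare two stored images outside the database.

    The database only hands back bytes; the comparison (here a sum of
    absolute byte differences) happens in application code.
    """
    blobs = []
    for image_id in (id_a, id_b):
        row = conn.execute(
            "SELECT pixels FROM micrograph WHERE id = ?", (image_id,)
        ).fetchone()
        if row is None:
            raise KeyError(f"no micrograph with id {image_id}")
        blobs.append(row[0])
    a, b = blobs
    if len(a) != len(b):
        raise ValueError("images differ in size")
    return sum(abs(x - y) for x, y in zip(a, b))


if __name__ == "__main__":
    conn = build_demo_db()
    print(ids_by_operator(conn, "alice"))
    print(blob_difference(conn, 1, 2))
```

In this sketch the question "which images did a given operator record?" is a one-line query, whereas "which images look alike?" cannot be expressed in SQL at all and requires fetching every blob.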
Of course, treated in this way, images cannot be queried directly in a database, and operations such as comparing two images are not trivial. A further element of discussion appears when databases are tied into the daily operation of a resource, so that the inputs to instruments and their outputs all follow a complex workflow in an automated environment. Certainly, the need to manage complex workflows is not new, and a couple of decades ago the so-called LIMS (Laboratory Information Management Systems) were already in use to help keep track of workflows in areas as diverse as the petrochemical industry and centralized service resources. Still, the "type" of information that could be properly stored and analysed was basically alphanumeric. A new challenge appears when we want to organize the information contained in complex multidimensional images in such a way that they are not only stored as "blobs", but are also analysed while being acquired, with the result of this analysis used as input to subsequent operations in the new automated resource. Of course, this approach calls for a new modelling of the
doi:10.1017/s1431927603447855 fatcat:nvt362rvanb4zbodlx4p2ekydy