Advanced Topics in Information Retrieval
Search engines have become part of our daily lives. We use Google (or Bing, Yandex, Baidu, etc.) as the main gateway to find information on the Web. With a certain type of content in mind, we may search directly on a particular site or service, e.g., on Facebook or LinkedIn for people, organizations, and events; on Amazon or eBay for products; or on YouTube or Spotify for music. Even on our smartphones, we are increasingly reliant on search functionality to find contacts, email, notes, calendar entries, apps, etc. We have grown accustomed to expecting a search box somewhere near the top of the screen, and we have also raised our expectations of the quality and speed of the responses to our searches.
At the highest level of abstraction, the field of information retrieval (IR) is concerned with developing technology for matching information needs with information objects. What we put in the search box, i.e., the query, is an expression of our information need. It may range from a few simple keywords (e.g., "Bond girls") to a proper natural language question (e.g., "What are good digital cameras under $300?"). The search engine then responds with a ranked list of items, i.e., information objects. Traditionally, these items were documents. In fact, many have seen IR as synonymous with document retrieval.
The past decade, however, has seen an enormous development in search technology. As regular users, we have witnessed first-hand the transition of search engines into "answering engines." Today's web search engines return rich search result pages, which include direct displays of entities, facts, and other structured results instead of merely a list of documents ("ten blue links"), as illustrated in Fig. 1.1. A primary enabling component behind these advanced search services is the availability of large-scale structured knowledge repositories (called knowledge bases), which organize information around specific things or objects (which we will refer to as entities). The objective of this book is to give a detailed account of a decade of IR research that has enabled us to search for "things, not strings."