Cacao, Cocao, or Cocoa?: Reconciliation of Taxonomic Names in Biodiversity Heritage Library [chapter]

Yi-Yun Cheng, Khanh Linh Hoang, Bertram Ludäscher
2020 Knowledge Organization at the Interface  
The Biodiversity Heritage Library (BHL) currently hosts more than 150 thousand titles and 57 million OCR-scanned pages of biodiversity literature dating back to the 16th century. While considerable research effort has gone into extracting taxonomic names from BHL's literature, the issue of name reconciliation has yet to be studied. Through the use case of Theobroma cacao, commonly known as the chocolate plant, this research aims to present a framework for reconciling species names in BHL by merging ... ternal taxonomies. We demonstrate this by using a logic-based taxonomy alignment approach to match variations of species and subspecies names of Theobroma cacao from four major biodiversity sources: the Encyclopedia of Life (EoL), the Integrated Taxonomic Information System (ITIS), the Global Biodiversity Information Facility (GBIF), and the United States Department of Agriculture PLANTS Database (USDA Plants).
doi:10.5771/9783956507762-88 fatcat:exzpq3fupjhbzolpjzyujm6oga