Automatically acquiring a semantic network of related concepts

Sean Szumlanski, Fernando Gomez
2010 Proceedings of the 19th ACM international conference on Information and knowledge management - CIKM '10  
We describe the automatic acquisition of a semantic network in which over 7,500 of the most frequently occurring nouns in the English language are linked to their semantically related concepts in the WordNet noun ontology. Relatedness between nouns is discovered automatically from lexical co-occurrence in Wikipedia texts using a novel adaptation of an information theoretic inspired measure. Our algorithm then capitalizes on salient sense clustering among these semantic associates to automatically disambiguate them to their corresponding WordNet noun senses (i.e., concepts). The resultant concept-to-concept associations, stemming from 7,593 target nouns, with 17,104 distinct senses among them, constitute a large-scale semantic network with 208,832 undirected edges between related concepts. Our work can thus be conceived of as augmenting the WordNet noun ontology with RelatedTo links. The network, which we refer to as the Szumlanski-Gomez Network (SGN), has been

ACKNOWLEDGMENTS

I would first like to thank my committee for their countless contributions to my graduate career over the years. Charlie Hughes, Annie Wu, and Valerie Sims have been my teachers, collaborators, co-authors, and mentors. They have impressed me not only with their invaluable intellectual contributions to my work, but also with how tirelessly they work to set their students up for success. The encouraging and giving nature of these individuals has made me want to work harder and give more to my own students and colleagues.

I am particularly grateful to my advisor, Fernando Gomez. He has spent countless hours in conversation with me over the years, passing on his knowledge not only of artificial intelligence and computational linguistics, but also of art, philosophy, history, literature, and so many other things. In my early years under his advisement, one was as likely to find us discussing Chomsky as one was to find us discussing Lorca and Franco, Picasso's Guernica, Bach's cantatas, or Goya's witches. I found in him a veritable Abbé Faria (in the Dumasian sense) whose breadth and depth of knowledge helped make me a more educated and well-rounded person. My life is richer for our time together, and I am grateful.

Maxine Najle helped facilitate data collection for the perceptions of relatedness study that is included as part of this dissertation.
I owe her huge thanks for that and for giving so generously of her time during one of the most demanding semesters of her undergraduate career. I am also grateful to my colleagues from the UCF AI Lab whose discussions, counsel, distractions, and friendship over the years contributed to my education and personal wellbeing. In lexicographic order, they are: Adam Campbell, who inspired me with his diligence and unassuming competence; Adelein Rodriguez, who helped keep me grounded with her perspectives on life, AI, and so many things in between; Andy Schwartz, who forged ahead of me on this incredible journey and, in doing so, cleared away some of the thorny brush along the trail and showed me it was possible to reach the goal; Chris Millward, who impressed me with his knack for coming up with creative solutions to problems large and small and by being one of the most level-headed and genuine people I know; Nadeem Mohsin, who made the AI Lab immeasurably more interesting, geeky, and fun by sharing just some of what is stored in his amazingly encyclopedic brain; and Ramya Pradhan, who reminded me that sometimes we have to make sacrifices for the things we want.
doi:10.1145/1871437.1871445 dblp:conf/cikm/SzumlanskiG10 fatcat:zhykl7wwkrfcjdigpe633odjhi