Statistical properties of Chinese semantic networks

HaiTao Liu
2009 Science Bulletin  
Almost all language networks at the word and syntactic levels are small-world and scale-free. This raises the question of whether a language network at a deeper semantic or cognitive level also has similar properties. To answer this question, we built a Chinese semantic network based on a treebank with semantic role (argument structure) annotation and investigated its global statistical properties. The results show that although the semantic network is also small-world and scale-free, it is
... nt from the syntactic network in hierarchical structure and K-Nearest-Neighbor correlation.

Keywords: semantic network, semantic role, complex network, Chinese, small-world, scale-free

Language networks are small-world and scale-free, although they are built on different principles [1]. These shared global statistical properties are independent of linguistic structure and typology [1-5]. If the global properties of a language network cannot reflect the differences among these structures, how can we regard these statistical properties as indicators of a language network? Do linguistic structures really influence the statistical properties of a language network? More concretely, does a syntactic network have the same properties as a semantic or conceptual one? To answer these questions, it seems necessary to investigate language networks built on different linguistic principles or levels. Syntactic networks have been explored in several languages [2, 4, 5], but the statistical properties of a (dynamic) semantic (argument structure) network based on real text have not yet been reported. This paper explores these questions.

To investigate the statistical properties of a semantic network, we built a corpus with semantic role (argument structure) annotation. The final corpus includes 34435 word tokens. Based on the corpus, we built a Chinese semantic network with 5903 nodes.

Considering the close relation between syntactic and semantic structures in a language, it is interesting to observe their differences and similarities from the viewpoint of complex networks. In a semantic (language) network, a node represents an auto-semantic word, and an edge represents the semantic relation between two words. A semantic network is intermediate between syntactic and conceptual networks. Therefore semantic networks, in particular dynamic semantic networks (i.e. those based on real language usage or text), are useful for exploring the following three questions: the organization of human semantic (or conceptual) knowledge, human performance in semantic processing, and the processes of semantic retrieval and search.
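
To make the node-and-edge construction and the global measures named above more concrete, here is a minimal Python sketch. It is illustrative only: the word pairs are invented toy data, not taken from the annotated corpus, the function names are hypothetical, and treating the network as undirected and unweighted is a simplifying assumption rather than a confirmed detail of the study. It computes the average clustering coefficient and average shortest path length (the usual small-world indicators), the degree distribution (whose tail is inspected for scale-free behaviour), and the average neighbour degree per degree class (one common form of K-Nearest-Neighbor correlation).

```python
from collections import defaultdict, deque


def build_graph(pairs):
    """Build an undirected graph: each word is a node, each pair is an edge."""
    adj = defaultdict(set)
    for a, b in pairs:
        if a != b:  # ignore self-loops
            adj[a].add(b)
            adj[b].add(a)
    return dict(adj)


def average_clustering(adj):
    """Mean local clustering coefficient; nodes with degree < 2 count as 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        # Count edges that exist among the neighbours of this node.
        links = sum(1 for u in nbrs for v in nbrs if u < v and v in adj[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def average_path_length(adj):
    """Mean shortest path length over all reachable ordered node pairs (BFS)."""
    total, count = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        count += len(dist) - 1
    return total / count if count else 0.0


def degree_distribution(adj):
    """Fraction of nodes having each degree k."""
    counts = defaultdict(int)
    for nbrs in adj.values():
        counts[len(nbrs)] += 1
    return {k: c / len(adj) for k, c in sorted(counts.items())}


def knn_by_degree(adj):
    """Average neighbour degree, averaged over all nodes of the same degree."""
    sums, counts = defaultdict(float), defaultdict(int)
    for nbrs in adj.values():
        k = len(nbrs)
        sums[k] += sum(len(adj[n]) for n in nbrs) / k
        counts[k] += 1
    return {k: sums[k] / counts[k] for k in sorted(sums)}


# Invented toy pairs (hypothetical; NOT from the corpus described in the text).
toy_pairs = [
    ("eat", "boy"), ("eat", "apple"), ("boy", "apple"),
    ("eat", "quickly"), ("read", "boy"), ("read", "book"),
]
graph = build_graph(toy_pairs)

print("clustering:", average_clustering(graph))
print("path length:", average_path_length(graph))
print("degree distribution:", degree_distribution(graph))
print("k_nn(k):", knn_by_degree(graph))
```

On a real network one would compare the clustering coefficient and path length with those of a random graph of the same size, and examine the degree distribution on log-log axes; a toy graph of six nodes cannot show either property.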
doi:10.1007/s11434-009-0467-x fatcat:pb4w4el4yrb2xhh2gordb64sge