Type/Token Ratios: what do they really tell us?
Journal of Child Language
Type/Token Ratios have been extensively used in child language research as an index of lexical diversity. This paper shows that the measure has frequently failed to discriminate between children at widely different stages of language development, and that the ratio may in fact fall as children get older. It is suggested here that such effects are caused by a negative, though non-linear, relationship between sample size (i.e. number of tokens) and Type/Token Ratio. Effects of open and closed class items are considered and an alternative Verbal Diversity measure is examined. Standardization of the number of tokens before computing Type/Token Ratios is recommended.

In an investigation into the language development of 480 children between the ages of three and eight, Templin (1957) compares the total number of different words (types) in 50 consecutive utterances with the total number of words in the same 50 utterances (tokens). She concludes:

This ratio is approximately one different word for slightly over every two words uttered. The ratio shows little variation over the age range tested and among subsamples, sex, and SES groups. (Templin 1957: 115)

In theory, Type/Token Ratio (TTR) weights range of vocabulary for size of speech sample. This is necessary because it is reasonable to assume that the more words sampled, the greater the probability of finding more different words. The larger the resulting TTR, the less repetitive the vocabulary usage; if a speech sample contains 20 words and they are all different, we obtain the 'ideal' TTR: 20/20 = 1.00. On the other hand the sample in which the same [•]
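
The ratio described above (types divided by tokens) can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names, the whitespace tokenization, the lowercasing and the example sentence are all assumptions made for illustration. The second function shows one simple way to standardize the number of tokens, by truncating every sample to the same length before computing the ratio.

```python
def type_token_ratio(tokens):
    """Return the number of distinct words (types) divided by the number of words (tokens)."""
    if not tokens:
        raise ValueError("cannot compute a TTR for an empty sample")
    return len(set(tokens)) / len(tokens)


def standardized_ttr(tokens, n_tokens):
    """Return the TTR of the first n_tokens words only.

    Truncating to a fixed size is an assumed, simple form of standardization;
    it makes samples of different lengths comparable.
    """
    if len(tokens) < n_tokens:
        raise ValueError("sample is shorter than the standard size")
    return type_token_ratio(tokens[:n_tokens])


# Hypothetical sample; splitting on whitespace is a simplifying assumption.
sample = "the dog saw the cat and the cat saw the dog".lower().split()

print(type_token_ratio(sample))     # 5 types / 11 tokens
print(standardized_ttr(sample, 5))  # 4 types / 5 tokens
```

In this toy sample the ratio is 4/5 for the first five words but only 5/11 for the whole sentence, which is the direction of the sample-size effect discussed in the text: longer samples tend to repeat words and so yield lower ratios.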