Latent semantic analysis for text-based research

Peter W. Foltz
1996 Behavior Research Methods, Instruments, & Computers
Latent semantic analysis (LSA) is a statistical model of word usage that permits comparisons of semantic similarity between pieces of textual information. This paper summarizes three experiments that illustrate how LSA may be used in text-based research. Two experiments describe methods for analyzing a subject's essay to determine from which text the subject learned the information and to grade the quality of the information cited in the essay. The third experiment describes using LSA to measure the coherence and comprehensibility of texts.

One of the primary goals in text-comprehension research is to understand what factors influence a reader's ability to extract and retain information from textual material. The typical approach is to have subjects read textual material and then produce some form of summary, such as answering questions or writing an essay. This summary permits the experimenter to determine what information the subject has gained from the text. To analyze what a subject has learned, the experimenter must relate what was in the summary to what the subject has read. This permits the subject's representation (cognitive model) of the text to be compared with the representation expressed in the original text.

For such an analysis, the experimenter must examine each sentence in the subject's summary and match the information it contains to the information contained in the texts that were read. Information in the summary that is highly related to information in a text was likely learned from that text. Matching this information is not easy, however. It requires scanning through the original texts to locate the information. In addition, since subjects do not write exactly the same words as those they have read, it is not possible to look for exact matches. Instead, the experimenter must make the match on the basis of the semantic content of the text.

This work has benefited from collaborative research with
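The matching task described above (relating a sentence from a subject's essay to the text it most likely came from, without relying on exact word matches) can be illustrated with a minimal sketch. It assumes a plain word-count matrix, a truncated singular value decomposition, and cosine similarity in the reduced space; the toy texts, the function names, and the choice of k = 2 dimensions are hypothetical and are not taken from the paper.

```python
import numpy as np

def term_document_matrix(texts):
    """Return (X, index): raw word counts, one row per word, one column per text."""
    vocab = sorted({w for t in texts for w in t.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    X = np.zeros((len(vocab), len(texts)))
    for j, t in enumerate(texts):
        for w in t.lower().split():
            X[index[w], j] += 1.0
    return X, index

def rank_sources(texts, sentence, k=2):
    """Cosine similarity between `sentence` and each text in a k-dimensional space."""
    X, index = term_document_matrix(texts)
    # Singular values come back in descending order, so the first k columns
    # of U span the k strongest dimensions of word co-occurrence.
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    U_k = U[:, :k]
    doc_vecs = U_k.T @ X              # each text projected into the reduced space
    q = np.zeros(len(index))
    for w in sentence.lower().split():
        if w in index:                # words never seen in the texts are ignored
            q[index[w]] += 1.0
    q_vec = U_k.T @ q                 # the essay sentence, projected the same way
    scores = []
    for j in range(X.shape[1]):
        d = doc_vecs[:, j]
        denom = np.linalg.norm(q_vec) * np.linalg.norm(d)
        scores.append(float(q_vec @ d / denom) if denom > 0 else 0.0)
    return scores

# Hypothetical source texts: two about a canal, two about volcanoes.
source_texts = [
    "canal ships cross panama locks",
    "panama canal treaty",
    "volcano magma eruption",
    "magma lava volcano crust",
]
scores = rank_sources(source_texts, "lava crust", k=2)
for text, score in zip(source_texts, scores):
    print(f"{score:6.3f}  {text}")
```

In this toy run the sentence "lava crust" shares no words with "volcano magma eruption", yet the two score as highly similar because both load on the same reduced dimension, while the canal texts score near zero. A realistic analysis would use a much larger corpus, term weighting, and many more dimensions.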
doi:10.3758/bf03204765 fatcat:mm5jqph3frbh3okp2s3rqxsoza