Extracting semantic information structures from free text law enforcement data

James R. Johnson, Anita Miller, Latifur Khan, Bhavani Thuraisingham
2012 IEEE International Conference on Intelligence and Security Informatics
A detective distributes information on a current case to his law enforcement peers. He quickly receives a computer-generated response with leads identified within hundreds of thousands of previously distributed free text documents from thousands of other detectives. The challenges lie in the nature of free text: unstructured formats, confusing word usage, cut-and-paste additions, abbreviations, inserted HTML/XML tags, multimedia content, and domain-specific terminology. This research proposes a new data structure, the semantic information structure, which encapsulates the extracted content information on classes of information such as people, vehicles, events, organizations, objects, and locations, as well as the contextual information about the connections, and measures to enable prioritization of files containing related pieces of content. The structure is organized to be a result of automated natural language processing methods that extract entities, expanded entity phrases, and their links, which are driven by ontologies, DL-Safe rules, abductive hypotheses, and semantic composition. Importance and significance measures aid in prioritization.
Keywords: semantic information structure; semantic content; semantic context; law enforcement; free text; ontology; abductive reasoning; expanded entity phrase; related information of interest
doi:10.1109/isi.2012.6284291 dblp:conf/isi/JohnsonMKT12 fatcat:fg2trgajvzbqtksizi2qmp43ra
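
The abstract describes the semantic information structure only at a high level, so the following is a minimal sketch of how such a record might be laid out, not the authors' implementation. The entity classes, the expanded entity phrase, the links, and the importance and significance measures come from the abstract. Every identifier (`Entity`, `Link`, `SemanticInformationStructure`, `prioritize`) and the overlap-based ranking rule are assumptions made for illustration; the paper's actual measures and extraction pipeline are not specified here.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class EntityClass(Enum):
    """Classes of information named in the abstract."""
    PERSON = "person"
    VEHICLE = "vehicle"
    EVENT = "event"
    ORGANIZATION = "organization"
    OBJECT = "object"
    LOCATION = "location"


@dataclass(frozen=True)
class Entity:
    entity_class: EntityClass
    text: str             # the extracted entity as it appears in the document
    expanded_phrase: str  # the expanded entity phrase built around it


@dataclass(frozen=True)
class Link:
    """Contextual connection between two entities (hypothetical layout)."""
    source: Entity
    target: Entity
    relation: str         # hypothetical label for the connection


@dataclass
class SemanticInformationStructure:
    """Content (entities), context (links), and measures for one document."""
    document_id: str
    entities: List[Entity] = field(default_factory=list)
    links: List[Link] = field(default_factory=list)
    importance: float = 0.0    # placeholder measure, scale is an assumption
    significance: float = 0.0  # placeholder measure, scale is an assumption


def prioritize(structures, query):
    """Return ids of documents sharing expanded phrases with the query.

    Assumed ranking rule: more shared phrases first, then higher
    importance, then higher significance. Documents with no shared
    phrase are dropped.
    """
    query_phrases = {e.expanded_phrase.lower() for e in query.entities}

    def score(s):
        overlap = sum(
            1 for e in s.entities if e.expanded_phrase.lower() in query_phrases
        )
        return (overlap, s.importance, s.significance)

    ranked = sorted(structures, key=score, reverse=True)
    return [s.document_id for s in ranked if score(s)[0] > 0]
```

In this sketch, a new case would be wrapped in its own `SemanticInformationStructure` and passed as `query`, with previously distributed documents passed as `structures`; the returned list is the order in which files would be surfaced as leads.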