Informing the Curious Negotiator: Automatic News Extraction from the Internet [chapter]

Debbie Zhang, Simeon J. Simoff
2006 Lecture Notes in Computer Science  
In negotiation, information acquisition and validation play an important role in the decision-making process. In this paper we briefly present the framework of a smart data mining system that provides contextual information from the Internet to a negotiation agent. We then present one of its components in more detail: an effective automated technique for extracting relevant articles from news web sites, so that they can be used further by the mining agents. Most current techniques experience difficulties in coping with changes in web site structure and format. The proposed extraction process is completely automatic and independent of web site formats. The technique is based on identifying regularities in both the format and the content of news web sites. The algorithms are applicable to both single- and multi-document web sites. Since invalid URLs can cause errors in data extraction, we also present a method for the negotiation agent to estimate the validity of the extracted data based on the frequency of relevant words in the news title. This paper also presents a new procedure for constructing news data sets on given topics. The extracted news data set is further utilised by the parties involved in negotiation. The information retrieved from the data set can support both human and automated negotiators.
doi:10.1007/11677437_14 fatcat:mupi2fkh6rh57ptxgiukqcre4m