WEB SCALE INFORMATION EXTRACTION USING WRAPPER INDUCTION APPROACH

RINA ZAMBAD, JAYANT GADGE
2014 International Journal of Electronics and Electrical Engineering
Information extraction from unstructured, ungrammatical data such as classified listings is difficult because traditional structural and grammatical extraction methods do not apply. The proposed architecture extracts unstructured and ungrammatical data using wrapper induction and shows the results in a structured format. The data are collected from various post websites. The obtained post pages are processed by page parsing, cleansing and data extraction to obtain new reference ... ts. Reference sets are used for mapping the user search query, which improves the scale of search on unstructured and ungrammatical post data. We validate our approach with experimental results.
doi:10.47893/ijeee.2014.1121 fatcat:jh5qa2w3offqrcnkonkke7mwcm
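
The processing steps named in the abstract (page parsing, cleansing, data extraction, and mapping a user query through a reference set) can be sketched roughly as follows. This is only a minimal illustration, not the authors' implementation: it does not learn wrapper rules, and simply matches known values against cleansed post text. The reference set contents, the sample post, and all function names (`parse_page`, `cleanse`, `extract`, `search`) are hypothetical.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the visible text of a post page, skipping script/style content."""

    def __init__(self):
        super().__init__()
        self._skip = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def parse_page(html):
    """Page parsing: reduce an HTML post page to its text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def cleanse(text):
    """Cleansing: lowercase and replace runs of non-alphanumerics with one space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def extract(text, reference_set):
    """Data extraction: keep the first reference value per attribute that
    appears as a whole word (or phrase) in the cleansed text."""
    padded = f" {text} "
    record = {}
    for attribute, values in reference_set.items():
        for value in values:
            if f" {value} " in padded:
                record[attribute] = value
                break
    return record


def search(query, records, reference_set):
    """Map the query onto the reference set, then return the matching records."""
    wanted = extract(cleanse(query), reference_set)
    if not wanted:
        return []
    return [r for r in records if all(r.get(k) == v for k, v in wanted.items())]


# Hypothetical reference set for used-car classified posts.
REFERENCE_SET = {
    "make": ["honda", "toyota"],
    "model": ["civic", "accord", "corolla"],
}

# Hypothetical post page.
POST_HTML = (
    "<html><body><script>var x=1;</script>"
    "<h1>2004 HONDA Civic!!</h1><p>great cond, $4500 obo</p></body></html>"
)

record = extract(cleanse(parse_page(POST_HTML)), REFERENCE_SET)
matches = search("Honda civic", [record], REFERENCE_SET)
```

Matching the query against the same reference set as the posts means both sides are compared as normalized attribute values rather than as raw strings, which is the role the abstract assigns to reference sets.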