Data Enrichment [chapter]

2017 Encyclopedia of Machine Learning and Data Mining  
Before data can be analyzed, they must be organized into an appropriate form. Data preparation is the process of manipulating and organizing data prior to analysis. It is typically an iterative process of turning raw data, which are often unstructured and messy, into a more structured and useful form that is ready for further analysis.

Decision trees and decision lists are two popular hypothesis languages, which share quite a few similarities, but also have important ... s with respect to expressivity and learnability.

Decision Rule. A decision rule is an element (piece) of knowledge, usually in the form of an "if-then statement": if <Condition> then <Action>. If its Condition is satisfied (i.e., matches a fact in the corresponding database of a given problem), then its Action (e.g., classification or decision making) is performed.

A decision stump is a decision tree that uses only a single attribute for splitting. For discrete attributes, this typically means that the tree consists only of a single interior node (i.e., the root has only leaves as successor nodes). If the attribute is numerical, the tree may be more complex.

Deep Belief Networks (Deep Belief Nets).

Deep learning artificial neural networks have won numerous contests in pattern recognition and machine learning. They are now widely used by the world's most valuable public companies. I review the most popular algorithms for feedforward and recurrent networks and their history.

Dimensionality reduction is an important data pre-processing step when dealing with Big Data. We explain how it can be used for speeding up search operations and show applications for time-series datasets.
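
As an illustration of the decision stump described above, the following is a minimal sketch for the discrete-attribute case, not the encyclopedia's own implementation. The function names `fit_stump` and `predict`, the toy weather data, and the choice to predict the majority class at each leaf (with the overall majority class as a fallback for unseen values) are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def fit_stump(rows, labels, attribute):
    """Build a one-level tree that splits on a single discrete attribute.

    The root tests `attribute`; each observed value gets one leaf that
    predicts the majority class among the training rows with that value.
    (Majority voting is an assumption of this sketch.)
    """
    counts_by_value = defaultdict(Counter)
    for row, label in zip(rows, labels):
        counts_by_value[row[attribute]][label] += 1

    leaves = {
        value: counts.most_common(1)[0][0]
        for value, counts in counts_by_value.items()
    }
    # Fallback for attribute values never seen during training.
    default = Counter(labels).most_common(1)[0][0]
    return attribute, leaves, default


def predict(stump, row):
    """Follow the single split from the root to a leaf."""
    attribute, leaves, default = stump
    return leaves.get(row[attribute], default)


# Hypothetical toy data: one discrete attribute, two classes.
rows = [
    {"outlook": "sunny"},
    {"outlook": "sunny"},
    {"outlook": "rain"},
    {"outlook": "overcast"},
    {"outlook": "rain"},
]
labels = ["no", "no", "yes", "yes", "yes"]

stump = fit_stump(rows, labels, "outlook")
print(predict(stump, {"outlook": "sunny"}))
print(predict(stump, {"outlook": "rain"}))
```

Each leaf here can also be read as a decision rule of the form "if outlook = sunny then class = no", which connects the stump to the if-then rules defined earlier in the abstract.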
doi:10.1007/978-1-4899-7687-1_979 fatcat:v22x3o5iuzaa5ghwx5fvqztpg4