Improving Accuracy by applying Z-Score Normalization in Linear Regression and Polynomial Regression Model for Real Estate Data

Dimas Aryo Anggoro, Universitas Muhammadiyah Surakarta, Indonesia
2019 International Journal of Emerging Trends in Engineering Research  
This research aimed to determine whether a normalization technique, z-score normalization, could improve the accuracy of algorithms for predicting house prices in Sindian District, New Taipei City, Taiwan. The study focused on two regression techniques: linear regression and polynomial regression. Several features were used to model the house price: the transaction date, the house ..., the distance to the nearest MRT, the number of nearby convenience stores, the latitude geographic coordinate, and the longitude geographic coordinate. The data were split into a training dataset (75%) and a testing dataset (25%) using simple random sampling. The next step was to fit linear regression and polynomial regression models to the dataset. Based on the results, the optimum polynomial order was 2, i.e., quadratic polynomial regression. This algorithm was then applied to the normalized dataset and achieved a low Mean Squared Error (MSE) of 7.044 x 10^-7 and a high R-squared score of 0.989. The study concluded that the combination of quadratic polynomial regression and z-score normalization was the best-performing approach for predicting house prices in the real estate dataset for Sindian District, New Taipei City, Taiwan.
doi:10.30534/ijeter/2019/247112019 fatcat:2r6eh5diqjekzi3o7ag4ygtcf4
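
The preprocessing and evaluation steps named in the abstract (z-score normalization, a 75%/25% simple random split, MSE, and R-squared) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the function names, the random seed, and the sample distance values are all hypothetical, and no regression model is fitted here.

```python
import random
from statistics import mean, pstdev


def zscore_fit(column):
    """Return the mean and population standard deviation of a column."""
    return mean(column), pstdev(column)


def zscore_apply(column, mu, sigma):
    """Apply z = (x - mu) / sigma; a constant column maps to zeros."""
    if sigma == 0:
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Simple random split into (train, test); the seed is an assumption."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def mse(y_true, y_pred):
    """Mean Squared Error between observed and predicted values."""
    return mean([(t - p) ** 2 for t, p in zip(y_true, y_pred)])


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical distances to the nearest MRT in metres (made-up values,
# not taken from the dataset used in the paper).
distances = [85.0, 306.6, 562.0, 390.6, 2175.0, 623.5, 287.6, 5512.0]

train, test = train_test_split(distances)
mu, sigma = zscore_fit(train)               # statistics from training data only
train_z = zscore_apply(train, mu, sigma)
test_z = zscore_apply(test, mu, sigma)      # reuse the training statistics
```

Fitting the normalization statistics on the training split and reusing them on the testing split is a common design choice that keeps information from the test data out of preprocessing; the abstract does not state which variant the study used. Note also that MSE is expressed in squared units of the target, so MSE values are only directly comparable between runs that use the same target scale, whereas R-squared is unitless.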