Hedonic Model for Real Estate Prices: Application to Nova Friburgo-RJ
DOI:
https://doi.org/10.14295/vetor.v30i1.12879
Keywords:
Machine Learning, Multi-linear Regression, Real Estate Price Prediction
Abstract
With the expansion of the real estate market in the mountainous region of Rio de Janeiro State, an increasing number of people need to deal with real estate purchases and sales. However, fairly evaluating a real estate unit is not a simple task, and the result can be influenced by many different characteristics. To assist in this task, the present work aims to identify the most critical attributes for evaluating a property and to build a simple mathematical model that can be used to estimate property values in this region. Data on properties for sale in the city of Nova Friburgo were extracted from online ad portals to build a unique database of real estate data. Variable selection techniques and multiple linear regression were applied to these data to obtain a mathematical model that describes prices based on the property's essential characteristics. The results revealed that the most important factor in the evaluation is the property's total area. The developed model predicted prices with a mean percentage deviation of approximately 25% on the test database.
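As an illustration of the kind of workflow the abstract describes, the following minimal sketch fits a multiple linear regression by ordinary least squares and reports a mean percentage deviation on held-out data. It is not the authors' implementation: the data are synthetic, the features (`area`, `bedrooms`), the coefficients used to generate prices, and the train/test split are all hypothetical, and the deviation is assumed here to be the mean absolute percentage error, which the abstract does not confirm.

```python
import numpy as np


def fit_linear_model(X, y):
    """Fit y ~ b0 + b1*x1 + ... by ordinary least squares.

    Returns the coefficient vector with the intercept first.
    """
    A = np.column_stack([np.ones(len(X)), X])  # prepend intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    """Evaluate the fitted linear model on new feature rows."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef


def mean_percentage_deviation(y_true, y_pred):
    """Mean absolute percentage error, in percent (assumed definition)."""
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)


# Synthetic, hypothetical data: price driven mostly by total area.
rng = np.random.default_rng(0)
n = 200
area = rng.uniform(40, 300, n)        # total area in square metres
bedrooms = rng.integers(1, 5, n)      # 1 to 4 bedrooms
price = 50_000 + 3_000 * area + 20_000 * bedrooms + rng.normal(0, 40_000, n)

X = np.column_stack([area, bedrooms])
train, test = slice(0, 150), slice(150, n)   # simple hold-out split

coef = fit_linear_model(X[train], price[train])
mpd = mean_percentage_deviation(price[test], predict(coef, X[test]))

print("intercept and coefficients:", coef)
print(f"mean percentage deviation on the test split: {mpd:.1f}%")
```

Evaluating the deviation on rows that were not used for fitting mirrors the abstract's reference to a test database. The value printed for this synthetic data says nothing about the 25% figure reported for the Nova Friburgo listings.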