SMU Data Science Review


Because of the high barriers to conducting housing market research, many home sellers either go to market at an information disadvantage or pay large sums to hire a professional. This research aims to reduce these inefficiencies by proposing a framework that gives sellers a concrete recommendation on the optimal time and price at which to sell a home to maximize financial gain. The core data used in this research is the NOVA Home Price dataset, which contains 34,973 house listings spanning multiple years in Northern Virginia. A pipeline of machine learning models, including linear regression, random forest, XGBoost, and an artificial neural network, is trained and evaluated on predicting home close prices. The final model is an ensemble of random forest and XGBoost and is tested on both a holdout set of Northern Virginia data and real estate data scraped from Zillow to introduce some variance. To account for future economic trends, a long short-term memory (LSTM) model is then trained on temporal data from the Federal Reserve. Finally, the algorithm combines the outputs of these separate models to recommend the optimal time and price to go to market, as well as short-term investments that could increase the potential gain from the sale. The study finds that home features coupled with macroeconomic trends can offer home sellers strong recommendations on the optimal time and price to list a home. This research is preliminary and should be used as a baseline for future studies.
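
As a rough illustration of the ensemble step, the sketch below blends the close-price predictions of two models and scores the blend on a holdout set with root-mean-square error. It is a minimal sketch under stated assumptions, not the study's implementation: the abstract does not say how the random forest and XGBoost outputs are combined, so a weighted average is assumed here, and the function names, the weight, and all prices are hypothetical placeholders rather than values from the NOVA Home Price dataset.

```python
import math
from typing import List, Sequence


def blend_predictions(
    pred_a: Sequence[float], pred_b: Sequence[float], weight_a: float = 0.5
) -> List[float]:
    """Weighted average of two models' predictions (assumed combination rule)."""
    if len(pred_a) != len(pred_b):
        raise ValueError("prediction lists must have the same length")
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie in [0, 1]")
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(pred_a, pred_b)]


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root-mean-square error between actual and predicted close prices."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(y - p) ** 2 for y, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))


# Hypothetical holdout close prices and model outputs (illustrative only).
holdout_close = [500_000.0, 650_000.0, 420_000.0]
rf_preds = [520_000.0, 640_000.0, 430_000.0]   # stand-in for random forest output
xgb_preds = [490_000.0, 670_000.0, 400_000.0]  # stand-in for XGBoost output

ensemble_preds = blend_predictions(rf_preds, xgb_preds, weight_a=0.5)

print("random forest RMSE:", rmse(holdout_close, rf_preds))
print("XGBoost RMSE:", rmse(holdout_close, xgb_preds))
print("ensemble RMSE:", rmse(holdout_close, ensemble_preds))
```

In this toy case the two models err in opposite directions, so the average has a lower error than either model alone. Real models need not behave this way, and the blending weight would normally be chosen on validation data rather than fixed at 0.5.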