Based on hedonic theory and the available literature, the first step in variable selection
was to take the final price of each house, expressed in euros, as the dependent variable.
Selecting the independent variables was more complex and closely tied to the choice of
functional form. Of the 35 variables taken from the sample, the categorical ones were
transformed into dummy variables that take the value 1 when the given attribute is present
and 0 otherwise.
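
As an illustration of this encoding, the following is a minimal sketch in plain Python. The helper `to_dummies`, the column name `heating` and the sample records are hypothetical and do not come from the data set; dropping one category as the reference level is likewise an assumption, made here so that the dummies are not perfectly collinear with the intercept.

```python
def to_dummies(rows, column):
    """Replace a categorical column with one 0/1 column per category.

    Assumption: the first category in sorted order is dropped and acts
    as the reference level.
    """
    levels = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Copy every field except the categorical one being encoded.
        new_row = {k: v for k, v in row.items() if k != column}
        for level in levels[1:]:
            # 1 when the attribute is present, 0 in any other case.
            new_row[f"{column}_{level}"] = 1 if row[column] == level else 0
        encoded.append(new_row)
    return encoded


# Hypothetical records, for illustration only.
houses = [
    {"price": 180000, "heating": "central"},
    {"price": 95000, "heating": "none"},
    {"price": 130000, "heating": "individual"},
]
encoded = to_dummies(houses, "heating")
```

In this sketch a house with central heating has both dummies equal to 0, so the coefficients on the remaining dummies would be read relative to that category.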
We then carried out an exploratory analysis of the available variables, computing a
correlation matrix to detect multicollinearity. In a first stage, this information was
used to remove from the model those variables showing clear signs of
multicollinearity.
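
A correlation screen of this kind can be sketched as follows. This is only an illustration: the text does not report the cut-off used, so the threshold of 0.8, the helper names and the sample values are all assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def correlated_pairs(data, threshold=0.8):
    """List variable pairs whose absolute correlation reaches the threshold.

    Assumption: 0.8 is an illustrative cut-off, not one taken from the paper.
    """
    names = list(data)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(data[a], data[b])
            if abs(r) >= threshold:
                pairs.append((a, b, round(r, 3)))
    return pairs


# Hypothetical regressors, for illustration only.
data = {
    "area_m2": [60, 75, 90, 120, 150],
    "rooms": [2, 3, 3, 4, 5],
    "age_years": [30, 5, 18, 40, 12],
}
flagged = correlated_pairs(data)
```

With these made-up values only the pair (`area_m2`, `rooms`) is flagged, so one of the two would be a candidate for removal. Pairwise correlations cannot reveal collinearity involving three or more variables, which is why such a screen is best treated as a first pass.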
In addition, to keep extreme values of the price variable from distorting the subsequent
regression, two intervention variables (dummy variables by definition) were created:
"d1" and "d2".
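
The text does not state how "d1" and "d2" were defined. The sketch below therefore assumes, purely for illustration, that d1 flags unusually low prices and d2 unusually high ones. The cut-offs, the function name and the sample prices are all hypothetical.

```python
def intervention_dummies(prices, low_cut, high_cut):
    """Build two 0/1 intervention variables for extreme prices.

    Assumption: d1 marks prices below low_cut and d2 marks prices above
    high_cut; the paper does not report the actual definitions.
    """
    d1 = [1 if p < low_cut else 0 for p in prices]
    d2 = [1 if p > high_cut else 0 for p in prices]
    return d1, d2


# Hypothetical prices in euros and hypothetical cut-offs.
prices = [45000, 120000, 135000, 150000, 980000]
d1, d2 = intervention_dummies(prices, low_cut=60000, high_cut=600000)
```

Including such dummies as regressors lets the flagged observations shift the intercept rather than pull the slope coefficients.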
To identify the most suitable functional form for the regression analysis, we followed an
iterative procedure: variables were eliminated in successive steps using the simple hedonic
estimation technique, and the three most commonly used functional forms [linear,
log-linear (semilog) and double-log] were compared on the basis of goodness of fit.
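
For reference, the three functional forms can be written as below. The notation is an assumption introduced here: P_i is the price of house i, x_{ik} its k-th attribute, \beta_k the corresponding coefficient and \varepsilon_i the error term. In the double-log form, dummy variables would enter in levels, since the logarithm of zero is undefined.

```latex
\begin{align*}
\text{Linear:}     \quad & P_i     = \beta_0 + \sum_{k=1}^{K} \beta_k x_{ik} + \varepsilon_i \\
\text{Log-linear:} \quad & \ln P_i = \beta_0 + \sum_{k=1}^{K} \beta_k x_{ik} + \varepsilon_i \\
\text{Double-log:} \quad & \ln P_i = \beta_0 + \sum_{k=1}^{K} \beta_k \ln x_{ik} + \varepsilon_i
\end{align*}
```

In the log-linear form, \beta_k is approximately the proportional change in price associated with a one-unit change in x_{ik}. In the double-log form it is an elasticity.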
Furthermore, given the heteroskedasticity of the error term in the regressions, we applied
White’s (1980) method to all regressions in this work. The parameters are thus estimated by
ordinary least squares, but with a variance-covariance matrix of the estimators that remains
consistent even in the presence of heteroskedasticity, so that the standard errors and the
inference based on them are not distorted by heteroskedastic disturbances.
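
In matrix notation (introduced here for illustration), with X the n × (K+1) matrix of regressors, x_i' its i-th row, y the vector of the dependent variable and \hat{e}_i the OLS residuals, the OLS estimator and White's heteroskedasticity-consistent covariance estimator in its basic form are:

```latex
\begin{align*}
\hat{\beta} &= (X'X)^{-1} X'y \\
\widehat{\operatorname{Var}}(\hat{\beta}) &=
  (X'X)^{-1} \left( \sum_{i=1}^{n} \hat{e}_i^{\,2} \, x_i x_i' \right) (X'X)^{-1}
\end{align*}
```

The point estimates \hat{\beta} are the same as under conventional OLS. Only the estimated covariance matrix changes, and with it the standard errors and test statistics.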
After performing these steps and analyzing the results, we found that the log-linear
functional form gave the best goodness of fit; it was therefore used to specify the
hedonic house price function in this work.
Once the initial model had been estimated, the regression was refined by adding some
variable transformations that improved the overall goodness of fit (e.g.
surface area squared and number of bathrooms squared).
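
To illustrate what the squared terms imply in a log-linear specification, the model can be sketched as follows. The symbols are assumptions introduced here: S_i is the surface area of house i, B_i its number of bathrooms, and the sum collects the remaining regressors z_{ij}.

```latex
\begin{align*}
\ln P_i &= \beta_0 + \beta_1 S_i + \beta_2 S_i^2 + \gamma_1 B_i + \gamma_2 B_i^2
           + \sum_{j} \delta_j z_{ij} + \varepsilon_i \\
\frac{\partial \ln P_i}{\partial S_i} &= \beta_1 + 2 \beta_2 S_i
\end{align*}
```

The proportional effect of an additional unit of surface area is then no longer constant but depends on the size of the house. A negative \beta_2, for example, would mean that the effect diminishes as surface area grows.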