Title: | Heterogeneous Spatial Models |
---|---|
Description: | Spatial heterogeneity can be specified in various ways. 'hspm' is an ambitious project that aims to implement various methodologies to control for heterogeneity in spatial models. The current version of 'hspm' deals with spatial and (non-spatial) regimes models. In particular, the package allows the estimation of a general spatial regimes model with additional endogenous variables, specified in terms of a spatial lag of the dependent variable, the spatially lagged regressors, and, potentially, a spatially autocorrelated error term. Spatial regime models are estimated by instrumental variables and the generalized method of moments (see Arraiz et al. (2010) <doi:10.1111/j.1467-9787.2009.00618.x>, Bivand and Piras (2015) <doi:10.18637/jss.v063.i18>, Drukker et al. (2013) <doi:10.1080/07474938.2013.741020>, Kelejian and Prucha (2010) <doi:10.1016/j.jeconom.2009.10.025>). |
Authors: | Gianfranco Piras [aut, cre], Mauricio Sarrias [aut] |
Maintainer: | Gianfranco Piras <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.1-5 |
Built: | 2024-11-01 03:54:41 UTC |
Source: | https://github.com/gpiras/hspm |
A dataset containing the prices and other attributes of 211 dwellings in Baltimore, MD
baltim
A data frame with 211 rows and 17 variables:
ID variable
sales price, in 1,000 US dollars (MLS)
number of rooms
1 if detached unit, 0 otherwise
number of bathrooms
1 if patio, 0 otherwise
1 if fireplace, 0 otherwise
1 if air conditioning, 0 otherwise
1 if basement, 0 otherwise
number of stories
number of car spaces in garage (0 = no garage)
age of dwellings in years
1 if dwelling is in Baltimore County, 0 otherwise
lot size in hundreds of square feet
interior living space in hundreds of square feet
X coordinate on the Maryland grid
Y coordinate on the Maryland grid
https://geodacenter.github.io/data-and-lab/
Estimation of HSAR models by 2SLS
hsar2sls(formula, data, listw = NULL, index = NULL, nins = 2, ...)

## S3 method for class 'hsar2sls'
summary(object, MG = TRUE, ...)

## S3 method for class 'summary.hsar2sls'
print(x, digits = max(5, getOption("digits") - 3), ...)
formula | a symbolic description of the model. |
data | the data of class |
listw | object. An object of class |
index | index. |
nins | numeric. Number of instruments. |
... | additional arguments passed to |
MG | logical. If |
x, object | an object of class |
digits | the number of digits |
Estimation of HSAR models by Quasi-Maximum Likelihood
hsarML(
  formula,
  data,
  listw = NULL,
  index = NULL,
  gradient = TRUE,
  average = FALSE,
  init.values = NULL,
  print.init = FALSE,
  otype = c("maxLik", "optim"),
  ...
)

## S3 method for class 'hsarML'
coef(object, ...)

## S3 method for class 'hsarML'
summary(object, MG = TRUE, ...)

## S3 method for class 'summary.hsarML'
print(x, digits = max(5, getOption("digits") - 3), ...)
formula | a symbolic description of the model. |
data | the data of class |
listw | object. An object of class |
index | index. |
gradient | logical. Only for testing procedures. Should the analytic gradient be used in the ML optimization procedure? |
average | logical. Should the sample log-likelihood function be divided by N? |
init.values | if not |
print.init | logical. If |
otype | string. A string indicating whether package |
... | additional arguments passed to |
MG | logical. If |
x, object | an object of class |
digits | the number of digits |
The function ivregimes deals with the estimation of regime models. Most of the time, the variable identifying the regimes reveals some spatial aspects of the data (e.g., administrative boundaries). The model includes exogenous as well as endogenous variables among the regressors.
ivregimes(formula, data, rgv = NULL, vc = c("homoskedastic", "robust", "OGMM"))
formula | a symbolic description of the model of the form |
data | the data of class |
rgv | an object of class |
vc | one of "homoskedastic", "robust", or "OGMM" |
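The basic (non-spatial) model with endogenous variables can be written in a general way as:

$$
\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} =
\begin{bmatrix} X_1 & 0 \\ 0 & X_2 \end{bmatrix}
\begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix}
+ X\beta +
\begin{bmatrix} Y_1 & 0 \\ 0 & Y_2 \end{bmatrix}
\begin{bmatrix} \pi_1 \\ \pi_2 \end{bmatrix}
+ Y\pi +
\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \end{bmatrix}
$$

where $y = [y_1', y_2']'$, and the $n_1 \times 1$ vector $y_1$ contains the observations on the dependent variable for the first regime, and the $n_2 \times 1$ vector $y_2$ (with $n_1 + n_2 = n$) contains the observations on the dependent variable for the second regime. The $n_1 \times k_1$ matrix $X_1$ and the $n_2 \times k_2$ matrix $X_2$ are blocks of a block diagonal matrix, the vectors of parameters $\beta_1$ and $\beta_2$ have dimension $k_1 \times 1$ and $k_2 \times 1$, respectively, $X$ is the $n \times p$ matrix of regressors that do not vary by regime, and $\beta$ is a $p \times 1$ vector of parameters. The three matrices $Y_1$ ($n_1 \times q_1$), $Y_2$ ($n_2 \times q_2$) and $Y$ ($n \times r$), with corresponding vectors of parameters $\pi_1$, $\pi_2$ and $\pi$, contain the endogenous variables. Finally, $\varepsilon = [\varepsilon_1', \varepsilon_2']'$ is the $n \times 1$ vector of innovations.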
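As a rough illustration of the block-diagonal structure described above, the following base R sketch builds a two-regime regressor matrix by interacting regime indicators with a regressor. The variable names (`regime`, `x_vary`, `x_fix`) and the simulated data are hypothetical, and this is not claimed to be how ivregimes constructs its matrices internally.

```r
# Sketch: regressor matrix for a two-regime model, using base R only.
set.seed(1)
n <- 10
regime <- factor(rep(c("a", "b"), times = c(4, 6)))  # hypothetical regime variable
x_vary <- rnorm(n)   # regressor whose coefficient differs by regime
x_fix  <- rnorm(n)   # regressor with a coefficient common to both regimes

# One indicator column per regime (n x 2).
D <- model.matrix(~ regime - 1)

# Regime-specific intercepts and slopes: each column is zero outside its own
# regime, which gives a block-diagonal matrix when rows are sorted by regime.
X_rg <- cbind(D, D * x_vary)
colnames(X_rg) <- c("const_a", "const_b", "x_a", "x_b")

# Full design: the regime-varying block plus the regressor that does not vary.
Z <- cbind(X_rg, x_fix = x_fix)
print(round(Z, 2))
```

Endogenous variables that vary by regime could be expanded in the same way before being appended to the design.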
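The model is estimated by two-stage least squares (2SLS). In particular:

If vc = "homoskedastic", the variance-covariance matrix is estimated by $\hat{\sigma}^2 (\hat{Z}'\hat{Z})^{-1}$, where $\hat{Z} = PZ$, $P = H(H'H)^{-1}H'$, $H$ is the matrix of instruments, and $Z$ is the matrix of all exogenous and endogenous variables in the model.

If vc = "robust", the variance-covariance matrix is estimated by $(\hat{Z}'\hat{Z})^{-1}(\hat{Z}'\hat{\Sigma}\hat{Z})(\hat{Z}'\hat{Z})^{-1}$, where $\hat{\Sigma}$ is a diagonal matrix with diagonal elements $\hat{\sigma}_i^2$, for $i = 1, \ldots, n$.

Finally, if vc = "OGMM", the model is estimated in two steps. In the first step, the model is estimated by 2SLS, yielding the residuals $\hat{\varepsilon}$. With the residuals, the diagonal matrix $\hat{\Sigma}$ is estimated and is used to construct the matrix $\hat{S} = H'\hat{\Sigma}H$. Then $\hat{\delta}_{OGMM} = [Z'H\hat{S}^{-1}H'Z]^{-1} Z'H\hat{S}^{-1}H'y$, where $\hat{\delta}_{OGMM}$ is the vector of all the parameters in the model. The variance-covariance matrix is: $[Z'H\hat{S}^{-1}H'Z]^{-1}$.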
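The three variance-covariance options can be sketched with plain matrix algebra. The example below uses simulated data and textbook 2SLS and optimal GMM formulas; the variable names, the data-generating process, and the divisor `n` in the error variance are assumptions made for illustration, and the code is not taken from the ivregimes source.

```r
# Sketch: 2SLS with homoskedastic, robust, and optimal-GMM covariance matrices.
set.seed(42)
n  <- 200
h1 <- rnorm(n); h2 <- rnorm(n)            # excluded instruments (hypothetical)
x  <- rnorm(n)                            # exogenous regressor
v  <- rnorm(n)
y_end <- 0.8 * h1 + 0.5 * h2 + v          # endogenous regressor
eps <- 0.6 * v + rnorm(n) * (1 + abs(x))  # correlated with y_end, heteroskedastic
y <- 1 + 2 * x - 1 * y_end + eps

Z <- cbind(const = 1, x = x, y_end = y_end)     # all regressors
H <- cbind(const = 1, x = x, h1 = h1, h2 = h2)  # instruments

# 2SLS: project Z on H, then regress y on the projection.
Zhat <- H %*% solve(crossprod(H), crossprod(H, Z))
ZhZ_inv <- solve(crossprod(Zhat))
delta_2sls <- ZhZ_inv %*% crossprod(Zhat, y)
res <- drop(y - Z %*% delta_2sls)

# Homoskedastic case: sigma^2 (Zhat'Zhat)^{-1}; dividing by n is an assumption.
sigma2 <- sum(res^2) / n
V_hom <- sigma2 * ZhZ_inv

# Robust case: sandwich with squared residuals on the diagonal of Sigma.
meat <- crossprod(Zhat * res)             # Zhat' diag(res^2) Zhat
V_rob <- ZhZ_inv %*% meat %*% ZhZ_inv

# Optimal GMM: weight the moment conditions with S = H' diag(res^2) H.
S  <- crossprod(H * res)
ZH <- crossprod(Z, H)                     # Z'H
A  <- ZH %*% solve(S, t(ZH))              # Z'H S^{-1} H'Z
delta_ogmm <- solve(A, ZH %*% solve(S, crossprod(H, y)))
V_ogmm <- solve(A)

print(cbind(two_sls = drop(delta_2sls), ogmm = drop(delta_ogmm)))
```

The robust and optimal-GMM versions reuse the same first-step residuals; only the weighting of the moment conditions differs.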
An object of class ivregimes. A list of five elements. The first element of the list contains the estimation results. The other elements are needed for printing the results.
Gianfranco Piras and Mauricio Sarrias
data("natreg")

form <- HR90 ~ 0 | MA90 + PS90 + RD90 + UE90 | 0 | MA90 + PS90 + RD90 + FH90 + FP89 + GI89
split <- ~ REGIONS

mod <- ivregimes(formula = form, data = natreg, rgv = split, vc = "robust")
summary(mod)

mod1 <- ivregimes(formula = form, data = natreg, rgv = split, vc = "OGMM")
summary(mod1)

form1 <- HR90 ~ MA90 + PS90 | RD90 + UE90 -1 | MA90 + PS90 | RD90 + FH90 + FP89 + GI89 -1
mod2 <- ivregimes(formula = form1, data = natreg, rgv = split, vc = "homoskedastic")
summary(mod2)
Continental U.S. counties data for homicides and selected socio-economic characteristics. Data for four decennial census years: 1960, 1970, 1980 and 1990.
natreg
A data frame with 3085 rows and 73 variables
Regions of the US
Counties not in the south
Polygon id
County names
State name
FIPS code for the state
FIPS code for the county
state and county FIPS code
FIPS code for the state
FIPS code for the county
state + county FIPS code
dummy indicator: 1 if the county is in the southern US
homicide rate per 100,000 in 1960
homicide rate per 100,000 in 1970
homicide rate per 100,000 in 1980
homicide rate per 100,000 in 1990
homicide count, three year average centered on 1960
homicide count, three year average centered on 1970
homicide count, three year average centered on 1980
homicide count, three year average centered on 1990
county population in 1960
county population in 1970
county population in 1980
county population in 1990
resource deprivation in 1960
resource deprivation in 1970
resource deprivation in 1980
resource deprivation in 1990
population structure in 1960
population structure in 1970
population structure in 1980
population structure in 1990
unemployment rate in 1960
unemployment rate in 1970
unemployment rate in 1980
unemployment rate in 1990
divorce rate in 1960: pct. males over 14 divorced
divorce rate in 1970: pct. males over 14 divorced
divorce rate in 1980: pct. males over 14 divorced
divorce rate in 1990: pct. males over 14 divorced
median age in 1960
median age in 1970
median age in 1980
median age in 1990
log of population in 1960
log of population in 1970
log of population in 1980
log of population in 1990
log of population density in 1960
log of population density in 1970
log of population density in 1980
log of population density in 1990
log of median family income in 1959
log of median family income in 1969
log of median family income in 1979
log of median family income in 1989
pct. families below poverty in 1959
pct. families below poverty in 1969
pct. families below poverty in 1979
pct. families below poverty in 1989
pct. black in 1960
pct. black in 1970
pct. black in 1980
pct. black in 1990
Gini index of family income inequality in 1959
Gini index of family income inequality in 1969
Gini index of family income inequality in 1979
Gini index of family income inequality in 1989
pct. female headed households in 1960
pct. female headed households in 1970
pct. female headed households in 1980
pct. female headed households in 1990
West regional dummy
https://geodacenter.github.io/data-and-lab/
The function regimes deals with the estimation of regime models. Most of the time, the variable identifying the regimes reveals some spatial aspects of the data (e.g., administrative boundaries).
regimes(formula, data, rgv = NULL, vc = c("homoskedastic", "groupwise"))
formula | a symbolic description of the model of the form |
data | the data of class |
rgv | an object of class |
vc | one of "homoskedastic" or "groupwise" |
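For convenience and without loss of generality, we assume the presence of only two regimes. In this case, the basic (non-spatial) model is:

$$
\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} =
\begin{bmatrix} X_1 & 0 \\ 0 & X_2 \end{bmatrix}
\begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix}
+ X\beta +
\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \end{bmatrix}
$$

where $y = [y_1', y_2']'$, and the $n_1 \times 1$ vector $y_1$ contains the observations on the dependent variable for the first regime, and the $n_2 \times 1$ vector $y_2$ (with $n_1 + n_2 = n$) contains the observations on the dependent variable for the second regime. The $n_1 \times k_1$ matrix $X_1$ and the $n_2 \times k_2$ matrix $X_2$ are blocks of a block diagonal matrix, the vectors of parameters $\beta_1$ and $\beta_2$ have dimension $k_1 \times 1$ and $k_2 \times 1$, respectively, $X$ is the $n \times p$ matrix of regressors that do not vary by regime, $\beta$ is a $p \times 1$ vector of parameters and $\varepsilon = [\varepsilon_1', \varepsilon_2']'$ is the $n \times 1$ vector of innovations.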
If vc = "homoskedastic", the model is estimated by OLS. If vc = "groupwise", the model is estimated in two steps. In the first step, the model is estimated by OLS. In the second step, the inverses of the (groupwise) residuals from the first step are employed as weights in a weighted least squares procedure.
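The two-step idea can be illustrated with `lm()` alone. The sketch below assumes that the weights are the inverse of the residual variance within each regime, which is one common reading of a groupwise correction; the simulated data and the names `g`, `s2_g` and `w` are hypothetical and the code does not reproduce the internals of regimes.

```r
# Sketch: two-step groupwise weighted least squares with two regimes.
set.seed(7)
n <- 300
g <- factor(rep(c("city", "county"), times = c(120, 180)))  # hypothetical regimes
x <- rnorm(n)
sd_g <- ifelse(g == "city", 1, 3)            # error spread differs by regime
y <- ifelse(g == "city", 1 + 2 * x, -1 + 0.5 * x) + rnorm(n, sd = sd_g)

# Step 1: OLS with regime-specific intercepts and slopes.
ols <- lm(y ~ 0 + g + g:x)

# Residual variance computed separately for each regime.
s2_g <- tapply(residuals(ols)^2, g, mean)

# Step 2: WLS, weighting each observation by 1 / (variance of its regime).
w <- as.numeric(1 / s2_g[as.character(g)])
wls <- lm(y ~ 0 + g + g:x, weights = w)

print(cbind(ols = coef(ols), wls = coef(wls)))
print(cbind(se_ols = sqrt(diag(vcov(ols))), se_wls = sqrt(diag(vcov(wls)))))
```

When every coefficient varies by regime, as here, the point estimates of the two steps coincide and only the standard errors change; the estimates differ once some regressors are shared across regimes.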
An object of class lm and spregimes.
Gianfranco Piras and Mauricio Sarrias
data("baltim")

form <- PRICE ~ NROOM + NBATH + PATIO + FIREPL + AC + GAR + AGE + LOTSZ + SQFT
split <- ~ CITCOU

mod <- regimes(formula = form, data = baltim, rgv = split, vc = "groupwise")
summary(mod)

form <- PRICE ~ AC + AGE + NROOM + PATIO + FIREPL + SQFT | NBATH + GAR + LOTSZ - 1
mod <- regimes(form, baltim, split, vc = "homoskedastic")
summary(mod)
The function spregimes deals with the estimation of spatial regimes models. This is a general function that allows the estimation of various spatial specifications, including the spatial lag regimes model, the spatial error regimes model, and the spatial SARAR regimes model. Since the estimation is based on the generalized method of moments (GMM), endogenous variables can be included. For further information on estimation, see details.
spregimes(
  formula,
  data = list(),
  model = c("sarar", "lag", "error", "ols"),
  listw,
  wy_rg = FALSE,
  weps_rg = FALSE,
  initial.value = NULL,
  rgv = NULL,
  het = FALSE,
  verbose = FALSE,
  control = list()
)

## S3 method for class 'spregimes'
coef(object, ...)

## S3 method for class 'spregimes'
vcov(object, ...)

## S3 method for class 'spregimes'
print(x, digits = max(3, getOption("digits") - 3), ...)

## S3 method for class 'spregimes'
summary(object, ...)

## S3 method for class 'summary.spregimes'
print(x, digits = max(5, getOption("digits") - 3), ...)

## S3 method for class 'spregimes'
residuals(object, ...)

## S3 method for class 'spregimes'
fitted(object, ...)
formula | a symbolic description of the model of the form |
data | the data of class |
model | should be one of "sarar", "lag", "error", or "ols" |
listw | a spatial weighting matrix of class |
wy_rg | default FALSE |
weps_rg | default FALSE |
initial.value | initial value for the spatial error parameter |
rgv | an object of class |
het | heteroskedastic variance-covariance matrix |
verbose | print a trace of the optimization |
control | select arguments for the optimization |
object | an object of class spregimes |
... | additional arguments |
x | an object of class spregimes |
digits | number of digits |
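The function spregimes is a wrapper that allows the estimation of a general spatial regimes model. For convenience and without loss of generality, we assume the presence of only two regimes. In this case the general model can be written as:

$$
\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} =
W \begin{bmatrix} y_1 & 0 \\ 0 & y_2 \end{bmatrix}
\begin{bmatrix} \lambda_1 \\ \lambda_2 \end{bmatrix}
+ \begin{bmatrix} X_1 & 0 \\ 0 & X_2 \end{bmatrix}
\begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix}
+ X\beta
+ W \begin{bmatrix} X_1 & 0 \\ 0 & X_2 \end{bmatrix}
\begin{bmatrix} \delta_1 \\ \delta_2 \end{bmatrix}
+ WX\delta
+ \begin{bmatrix} Y_1 & 0 \\ 0 & Y_2 \end{bmatrix}
\begin{bmatrix} \pi_1 \\ \pi_2 \end{bmatrix}
+ Y\pi
+ \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \end{bmatrix}
$$

where

$$
\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \end{bmatrix} =
W \begin{bmatrix} \varepsilon_1 & 0 \\ 0 & \varepsilon_2 \end{bmatrix}
\begin{bmatrix} \rho_1 \\ \rho_2 \end{bmatrix}
+ u
$$

Here $W$ is the spatial weights matrix, $\lambda_1$ and $\lambda_2$ are the spatial lag coefficients, and $\rho_1$ and $\rho_2$ are the spatial error coefficients. The model includes the spatial lag of the dependent variable, the spatial lag of the regressors, the spatial lag of the errors and, possibly, additional endogenous variables. The function spregimes estimates all of the nested specifications deriving from this model. There are, however, some restrictions. For example, if weps_rg is set to TRUE, all the regressors in the model should also vary by regime. The estimation of the different models relies heavily on code available from the package sphet.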
For the spatial lag (or Durbin) regimes model (i.e., when the spatial error coefficients $\rho_1$ and $\rho_2$ are zero), an instrumental variable procedure is adopted, where the matrix of instruments is formed by the spatial lags of the exogenous variables and the additional instruments included in the formula. A robust estimation of the variance-covariance matrix can be obtained by setting het = TRUE.
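To make the instrumental variable step more concrete, the sketch below estimates a spatial lag model with regime-specific intercepts and slopes by spatial 2SLS in base R. It assumes a single spatial lag coefficient, a ring-shaped row-standardized weights matrix, and the common instrument set formed by the regressors and their first and second spatial lags; all names and data are hypothetical, and the exact instrument set used by spregimes may differ.

```r
# Sketch: spatial 2SLS for a lag model with two regimes (single lag coefficient).
set.seed(3)
n <- 100

# Row-standardized ring weights: each unit's neighbors are the two adjacent units.
idx <- seq_len(n)
W <- matrix(0, n, n)
W[cbind(idx, ifelse(idx == 1, n, idx - 1))] <- 0.5
W[cbind(idx, ifelse(idx == n, 1, idx + 1))] <- 0.5

g <- rep(c(1, 0), each = n / 2)           # 1 = first regime, 0 = second regime
x <- rnorm(n)
X <- cbind(c1 = g, c2 = 1 - g, x1 = g * x, x2 = (1 - g) * x)
beta   <- c(1, -1, 2, 0.5)
lambda <- 0.4

# Reduced form: y = (I - lambda W)^{-1} (X beta + eps).
y <- solve(diag(n) - lambda * W, X %*% beta + rnorm(n))

# Regressors: X plus the (endogenous) spatial lag of y.
Z <- cbind(X, Wy = drop(W %*% y))

# Instruments: X and spatial lags of the slope variables. Lags of the regime
# intercepts are left out because, with row-standardized W, they sum to one
# and would be collinear with the intercepts themselves.
Xs <- X[, c("x1", "x2")]
H <- cbind(X, W %*% Xs, W %*% W %*% Xs)

Zhat  <- H %*% solve(crossprod(H), crossprod(H, Z))
delta <- solve(crossprod(Zhat, Z), crossprod(Zhat, y))
print(round(drop(delta), 3))
```

Letting the lag coefficient vary by regime would require splitting the spatial lag column by regime in the same way as the slope variables.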
For the spatial error regimes model (i.e., when the spatial lag coefficients $\lambda_1$ and $\lambda_2$ are zero), the spatial error coefficient(s) $\rho_1$ and $\rho_2$ are estimated with the GMM procedure described in Kelejian and Prucha (2010) and Drukker et al. (2013). The difference between Kelejian and Prucha (2010) and Drukker et al. (2013) is that the former assume heteroskedastic innovations (het = TRUE), while the latter do not (het = FALSE).
For the SARAR regimes model, the estimation procedure alternates a series of IV and GMM steps. The variance-covariance matrix can be estimated assuming that the innovations are homoskedastic (het = FALSE) as well as heteroskedastic (het = TRUE).
An object of class "spregimes"
Gianfranco Piras and Mauricio Sarrias
Arraiz, I. and Drukker, D.M. and Kelejian, H.H. and Prucha, I.R. (2010) A Spatial Cliff-Ord-type Model with Heteroskedastic Innovations: Small and Large Sample Results, Journal of Regional Science, 50, pages 592–614.
Drukker, D.M. and Egger, P. and Prucha, I.R. (2013) On Two-step Estimation of a Spatial Autoregressive Model with Autoregressive Disturbances and Endogenous Regressors, Econometric Reviews, 32, pages 686–733.
Kelejian, H.H. and Prucha, I.R. (2010) Specification and Estimation of Spatial Autoregressive Models with Autoregressive and Heteroskedastic Disturbances, Journal of Econometrics, 157, pages 53–67.
Gianfranco Piras (2010). sphet: Spatial Models with Heteroskedastic Innovations in R. Journal of Statistical Software, 35(1), 1-21. doi:10.18637/jss.v035.i01.
Roger Bivand, Gianfranco Piras (2015). Comparing Implementations of Estimation Methods for Spatial Econometrics. Journal of Statistical Software, 63(18), 1-36. doi:10.18637/jss.v063.i18.
Gianfranco Piras, Paolo Postiglione (2022). A deeper look at impacts in spatial Durbin model with sphet. Geographical Analysis, 54(3), 664-684.
Luc Anselin, Sergio J. Rey (2014). Modern Spatial Econometrics in Practice: A Guide to GeoDa, GeoDaSpace and PySal. GeoDa Press LLC.
data("natreg")
data("ws_6")

form <- HR90 ~ 0 | MA90 + PS90 + RD90 + UE90 | 0 | 0 | MA90 + PS90 + RD90 + FH90 + FP89 + GI89 | 0
form1 <- HR90 ~ MA90 -1 | PS90 + RD90 + UE90 | 0 | MA90 -1 | PS90 + RD90 + FH90 + FP89 + GI89 | 0
form2 <- HR90 ~ MA90 -1 | PS90 + RD90 + UE90 | MA90 | MA90 -1 | PS90 + RD90 + FH90 + FP89 + GI89 | 0
form3 <- HR90 ~ MA90 -1 | PS90 + RD90 + UE90 | MA90 | MA90 -1 | PS90 + RD90 + FH90 + FP89 + GI89 | GI89
form4 <- HR90 ~ MA90 -1 | PS90 + RD90 + UE90 | MA90 + RD90 | MA90 -1 | PS90 + RD90 + FH90 + FP89 + GI89 | GI89
split <- ~ REGIONS

###################################################
# Linear model with regimes and lagged regressors #
###################################################
mod <- spregimes(formula = form2, data = natreg,
                 rgv = split, listw = ws_6, model = "ols")
summary(mod)

mod1 <- spregimes(formula = form3, data = natreg,
                  rgv = split, listw = ws_6, model = "ols")
summary(mod1)

mod2 <- spregimes(formula = form4, data = natreg,
                  rgv = split, listw = ws_6, model = "ols")
summary(mod2)

###############################
# Spatial Error regimes model #
###############################
mod <- spregimes(formula = form, data = natreg,
                 rgv = split, listw = ws_6, model = "error", het = TRUE)
summary(mod)

mod1 <- spregimes(formula = form, data = natreg,
                  rgv = split, listw = ws_6, model = "error",
                  weps_rg = TRUE, het = TRUE)
summary(mod1)

mod2 <- spregimes(formula = form1, data = natreg,
                  rgv = split, listw = ws_6, model = "error", het = TRUE)
summary(mod2)

###############################
# Spatial Lag regimes model   #
###############################
mod4 <- spregimes(formula = form, data = natreg,
                  rgv = split, listw = ws_6, model = "lag",
                  het = TRUE, wy_rg = TRUE)
summary(mod4)

mod5 <- spregimes(formula = form1, data = natreg,
                  rgv = split, listw = ws_6, model = "lag",
                  het = TRUE, wy_rg = TRUE)
summary(mod5)

###############################
# Spatial SARAR regimes model #
###############################
mod6 <- spregimes(formula = form, data = natreg,
                  rgv = split, listw = ws_6, model = "sarar",
                  het = TRUE, wy_rg = TRUE, weps_rg = TRUE)
summary(mod6)

mod7 <- spregimes(formula = form, data = natreg,
                  rgv = split, listw = ws_6, model = "sarar",
                  het = TRUE, wy_rg = FALSE, weps_rg = FALSE)
summary(mod7)

mod8 <- spregimes(formula = form1, data = natreg,
                  rgv = split, listw = ws_6, model = "sarar",
                  het = TRUE, wy_rg = TRUE, weps_rg = FALSE)
summary(mod8)
ws_6 is a spatial weights matrix based on the 6 nearest neighbors for the Continental U.S. counties data for homicides
ws_6
A spatial weighting matrix of class Matrix
https://geodacenter.github.io/data-and-lab/
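For readers unfamiliar with nearest-neighbor weights, the following base R sketch shows how a matrix of this kind can be built from point coordinates. The coordinates are simulated, a dense matrix is used for simplicity, and row-standardization is an assumption; nothing here reproduces how ws_6 itself was created.

```r
# Sketch: k-nearest-neighbor spatial weights from coordinates (dense, base R).
set.seed(11)
n <- 30
k <- 6
coords <- cbind(x = runif(n), y = runif(n))  # hypothetical centroids

d <- as.matrix(dist(coords))   # pairwise Euclidean distances
diag(d) <- Inf                 # a unit is never its own neighbor

W <- matrix(0, n, n)
for (i in seq_len(n)) {
  nb <- order(d[i, ])[seq_len(k)]  # indices of the k closest units
  W[i, nb] <- 1 / k                # equal weights, rows sum to one
}

print(table(rowSums(W > 0)))   # every row has exactly k neighbors
print(isSymmetric(W))          # nearest-neighbor relations need not be mutual
```

Because being among another unit's nearest neighbors is not a mutual relation, such a matrix is generally not symmetric.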