Introductory Econometrics. Chapter 16

Chapter 16: Simultaneous Equations Models

While the previous chapters discussed solutions to endogeneity arising from omitted variables and measurement error, we have not yet touched upon simultaneity. Simultaneity arises when one or more explanatory variables are jointly determined with the dependent variable. In this situation, we use simultaneous equations models (SEMs). The most popular method for estimating simultaneous equations models is the method of instrumental variables.

As the name suggests, a simultaneous equations model consists of multiple equations, that is, a system of equations. Each equation in the system should have a ceteris paribus interpretation.

The observed data give us only the equilibrium outcomes. Thus, when building a simultaneous equations model, we need to use counterfactual reasoning.

The classic example of a simultaneous equations model is a supply and demand model for a particular good or service. A simultaneous equations model (SEM) can also be used to study the effect of law enforcement size on the murder rate. A simple cross-sectional model to address this issue can be stated as

\[ murder.pc=\alpha_1 (police.pc)+\beta_{10}+\beta_{11}(inc.pc)+u_1 \]

Here, $murder.pc$ is the murder rate in the city per capita, $police.pc$ is the number of police officers per capita, and $inc.pc$ is the income per capita.

Such a model on its own has a major issue: the size of the police force depends on the crime rate. There is a two-way, or simultaneous, relationship between crime and the size of the police force. To address this issue, we can consider another relationship:

\[ police.pc=\alpha_2 (murder.pc) +\beta_{20} (Other) + u_2 \]

With both equations in hand, we can estimate the effect of additional police on the murder rate. [We hold off on the actual estimation until later.] We call such equations structural equations because we postulate them using economic theory. In this model, $murder.pc$ and $police.pc$ are endogenous (determined within the model), while $inc.pc$ and $Other$ are exogenous (determined outside the model).

First, we need to understand simultaneity bias in OLS when dealing with simultaneous equations models. Consider a two-equation structural model (we suppress the intercept for simplicity).

\[ y_1=\alpha_1 y_2 + \beta_1 z_1 + u_1 \] \[ y_2=\alpha_2 y_1 + \beta_2 z_2 + u_2 \]

We can solve the model using substitution.

\[ y_2=\alpha_2 y_1 + \beta_2 z_2 + u_2 \] \[ y_2= \alpha_2 (\alpha_1 y_2 + \beta_1 z_1 + u_1) + \beta_2 z_2 + u_2 \] Rearranging yields \[ (1-\alpha_2 \alpha_1)y_2= \alpha_2 \beta_1 z_1 + \beta_2 z_2 + \alpha_2 u_1 + u_2 \]

Assuming that $\alpha_2 \alpha_1 \neq 1$, we can solve for $y_2$. Dividing by $(1-\alpha_2 \alpha_1)$ gives us \[ y_2= \frac{\alpha_2 \beta_1}{(1-\alpha_2 \alpha_1)} z_1 + \frac{\beta_2}{(1-\alpha_2 \alpha_1)} z_2 + \frac{\alpha_2 u_1 + u_2}{(1-\alpha_2 \alpha_1)} \]

We call this the reduced form equation for $y_2$; $\frac{\alpha_2 \beta_1}{(1-\alpha_2 \alpha_1)}$ and $\frac{\beta_2}{(1-\alpha_2 \alpha_1)}$ are the reduced form parameters. The reduced form error is $\frac{\alpha_2 u_1 + u_2}{(1-\alpha_2 \alpha_1)}$. Since $u_1$ and $u_2$ are uncorrelated with $z_1$ and $z_2$, the reduced form error is also uncorrelated with $z_1$ and $z_2$, and thus we can consistently estimate the reduced form parameters by OLS.

Now return to the first structural equation. Because $z_1$ and $u_1$ are uncorrelated by assumption, the issue is whether $y_2$ and $u_1$ are uncorrelated. The reduced form shows that $y_2$ depends on $u_1$ whenever $\alpha_2 \neq 0$, so in general they are correlated. When $y_2$ is correlated with $u_1$ because of simultaneity, we say that OLS suffers from simultaneity bias. In simple models you may be able to determine the direction of the bias, but in more complex models (with multiple explanatory variables) this can be complicated.
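To make the source of the bias concrete, the covariance between $y_2$ and $u_1$ can be read off the reduced form for $y_2$. The sketch below uses $\sigma_1^2$ for the variance of $u_1$ (notation introduced here for illustration) and relies on $z_1$ and $z_2$ being uncorrelated with $u_1$. \[ Cov(y_2, u_1) = \frac{\alpha_2 Var(u_1) + Cov(u_1, u_2)}{1-\alpha_2 \alpha_1} \] Under the additional assumption that $u_1$ and $u_2$ are uncorrelated, this simplifies to \[ Cov(y_2, u_1) = \frac{\alpha_2}{1-\alpha_2 \alpha_1} \sigma_1^2 \] which is zero only when $\alpha_2 = 0$, that is, when $y_1$ does not feed back into $y_2$.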

As with endogeneity due to measurement error or omitted variables, 2SLS can be applied to treat simultaneity in simultaneous equations models.
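A small simulation can illustrate both the bias and the remedy. The sketch below is not part of the original example: the parameter values, the normal errors, and the sample size are all assumptions chosen for illustration. It generates data from the two-equation model above, estimates the first equation by OLS, and then runs the two stages by hand using $z_2$ as the instrument for $y_2$.

```r
# Simulated two-equation SEM (all parameter values are illustrative assumptions)
set.seed(16)
n  <- 50000
a1 <- 0.5; a2 <- -0.8   # structural coefficients on y2 and y1
b1 <- 1;   b2 <- 1      # coefficients on the exogenous z1 and z2

z1 <- rnorm(n); z2 <- rnorm(n)
u1 <- rnorm(n); u2 <- rnorm(n)

# Generate the endogenous variables from the reduced form for y2
d  <- 1 - a1 * a2
y2 <- (a2 * b1 * z1 + b2 * z2 + a2 * u1 + u2) / d
y1 <- a1 * y2 + b1 * z1 + u1

# OLS on the first structural equation (no intercept, as in the text)
ols_a1 <- coef(lm(y1 ~ 0 + y2 + z1))[["y2"]]

# Manual two stages: z2 is excluded from the first equation, so it
# serves as the instrument for y2
y2_hat <- fitted(lm(y2 ~ 0 + z1 + z2))
iv_a1  <- coef(lm(y1 ~ 0 + y2_hat + z1))[["y2_hat"]]

c(true = a1, ols = ols_a1, iv = iv_a1)
```

With these assumed values the OLS estimate should land well below the true $\alpha_1$, while the two-stage estimate should be close to it. The standard errors from the second `lm()` call are not valid 2SLS standard errors; only the point estimate is meaningful here.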

The key condition for OLS is that each explanatory variable is uncorrelated with the error term. While this condition may not hold in simultaneous equations models, we can still identify, or consistently estimate, the parameters of the model if we have suitable instrumental variables.

Consider a simple demand ($q_d$) and supply ($q_s$) model. \[ q_s=\alpha_1 p + \beta_1 z_1 + u_1\] \[ q_d=\alpha_2 p + u_2\]

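We have a variable $z_1$ that shifts the supply equation but does not affect demand. As we vary $z_1$, the supply function shifts but the demand function does not, so the observed price and quantity outcomes must lie on the same demand function. Varying $z_1$ therefore allows us to trace out, or identify, the demand function. The supply function, by contrast, is not identified in this model, because no exogenous variable shifts demand while being excluded from supply. In general, an equation in an SEM must satisfy the order condition to be identified. In a two-equation system, the order condition states that at least one exogenous variable is excluded from the equation; for identification, the excluded variable must also appear, with a nonzero coefficient, in the other equation (the rank condition).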
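The identification argument can also be written out algebraically. The sketch below assumes market clearing, so that $q_s = q_d = q$ in the observed data (an assumption the model leaves implicit), and that $z_1$ is uncorrelated with $u_2$. Taking the covariance of the demand equation with $z_1$ gives \[ Cov(z_1, q) = \alpha_2 Cov(z_1, p) + Cov(z_1, u_2) = \alpha_2 Cov(z_1, p) \] so that, provided $Cov(z_1, p) \neq 0$, \[ \alpha_2 = \frac{Cov(z_1, q)}{Cov(z_1, p)} \] The demand slope is thus expressed entirely in terms of moments of observable variables, which is what identification means. The same steps cannot be repeated for the supply equation, because it excludes no exogenous variable that could play the role of $z_1$.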

Consider a labor supply model for married working women. The hours a woman works is a function of wage and other factors (supply function). The demand function is the wage offer to the woman which depends on her education and experience. See the two structural equations below.

\[hours = \alpha_1 log(wage) + \beta_{10} + \beta_{11}educ + \beta_{12}age + \beta_{13}kidslt6 + \beta_{14}nwifeinc + u_1\]

\[log(wage) = \alpha_2 hours + \beta_{20} + \beta_{21}educ + \beta_{22}exper + \beta_{23} exper^2 + u_2\]

Using substitution we can derive the following.

\[log(wage) = \alpha_2 (\alpha_1 log(wage) + \beta_{10} + \beta_{11}educ + \beta_{12}age + \beta_{13}kidslt6 + \beta_{14}nwifeinc + u_1) + \\ \beta_{20} + \beta_{21}educ + \beta_{22}exper + \beta_{23} exper^2 + u_2\]

Solving further yields the following.

\[ log(wage) = \frac{\beta_{20}+\alpha_2 \beta_{10}}{1-\alpha_1 \alpha_2} + \frac{\alpha_2 \beta_{11} +\beta_{21}}{1-\alpha_1 \alpha_2} educ + \frac{\alpha_2 \beta_{12}}{1-\alpha_1 \alpha_2} age + \frac{\alpha_2 \beta_{13}}{1-\alpha_1 \alpha_2} kidslt6 + \frac{\alpha_2 \beta_{14}}{1-\alpha_1 \alpha_2} nwifeinc + \\ \frac{\beta_{22}}{1-\alpha_1 \alpha_2} exper + \frac{\beta_{23}}{1-\alpha_1 \alpha_2} exper^2 + \frac{\alpha_2 u_1 + u_2}{1-\alpha_1 \alpha_2} \] For brevity, we can simply write the following.

\[ log(wage) = \pi_{20} + \pi_{21} educ + \pi_{22} age + \pi_{23} kidslt6 + \pi_{24} nwifeinc + \\ \pi_{25} exper + \pi_{26} exper^2 + v_2 \]
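The reduced form for $log(wage)$ can be estimated directly by OLS, which also gives a rough check on identification of the hours equation: $exper$ and $exper^2$ are the exogenous variables excluded from that equation, so at least one of them should have a nonzero reduced form coefficient. The sketch below assumes the `wooldridge` package is installed; the object names (`working`, `rf_wage`, `rf_restricted`) are chosen here for illustration.

```r
# Reduced form for log(wage): regress on all exogenous variables in the system
data(mroz, package = "wooldridge")
working <- subset(mroz, !is.na(wage))   # women with an observed wage

rf_wage <- lm(log(wage) ~ educ + age + kidslt6 + nwifeinc + exper + I(exper^2),
              data = working)
summary(rf_wage)

# Restricted reduced form without the variables excluded from the hours equation
rf_restricted <- lm(log(wage) ~ educ + age + kidslt6 + nwifeinc,
                    data = working)

# Joint F test of pi_25 = pi_26 = 0 (coefficients on exper and exper^2)
anova(rf_restricted, rf_wage)
```

A rejection of the joint null in the `anova()` table supports using $exper$ and $exper^2$ as instruments for $log(wage)$ in the hours equation.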

The simultaneous equations model above can be estimated by two stage least squares using the R code below.

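library(systemfit)
data(mroz, package='wooldridge')

# Structural equations: labor supply (hours) and wage offer
eq.hrs   = hours    ~ log(wage)+educ+age+kidslt6+nwifeinc
eq.wage  = log(wage)~ hours    +educ+exper+I(exper^2)
eq.system= list(eq.hrs, eq.wage)
# Instruments: all exogenous variables in the system
instrum  = ~educ+age+kidslt6+nwifeinc+exper+I(exper^2)
# Keep only women with an observed wage, then estimate by 2SLS
summary(systemfit(eq.system,inst=instrum,
                  data=subset(mroz,!is.na(wage)),
                  method="2SLS"))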

## 
## systemfit results 
## method: 2SLS 
## 
##          N  DF       SSR detRCov   OLS-R2 McElroy-R2
## system 856 845 773893309  155089 -2.00762   0.748802
## 
##       N  DF         SSR         MSE        RMSE        R2    Adj R2
## eq1 428 422 7.73893e+08 1.83387e+06 1354.204541 -2.007617 -2.043253
## eq2 428 423 1.95266e+02 4.61621e-01    0.679427  0.125654  0.117385
## 
## The covariance matrix of the residuals
##             eq1         eq2
## eq1 1833869.938 -831.542690
## eq2    -831.543    0.461621
## 
## The correlations of the residuals
##           eq1       eq2
## eq1  1.000000 -0.903769
## eq2 -0.903769  1.000000
## 
## 
## 2SLS estimates for 'eq1' (equation 1)
## Model Formula: hours ~ log(wage) + educ + age + kidslt6 + nwifeinc
## Instruments: ~educ + age + kidslt6 + nwifeinc + exper + I(exper^2)
## 
##               Estimate Std. Error  t value   Pr(>|t|)    
## (Intercept) 2225.66182  574.56412  3.87365 0.00012424 ***
## log(wage)   1639.55561  470.57568  3.48415 0.00054535 ***
## educ        -183.75128   59.09981 -3.10917 0.00200323 ** 
## age           -7.80609    9.37801 -0.83238 0.40566404    
## kidslt6     -198.15429  182.92914 -1.08323 0.27932497    
## nwifeinc     -10.16959    6.61474 -1.53741 0.12494167    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1354.204541 on 422 degrees of freedom
## Number of observations: 428 Degrees of Freedom: 422 
## SSR: 773893113.843842 MSE: 1833869.938019 Root MSE: 1354.204541 
## Multiple R-Squared: -2.007617 Adjusted R-Squared: -2.043253 
## 
## 
## 2SLS estimates for 'eq2' (equation 2)
## Model Formula: log(wage) ~ hours + educ + exper + I(exper^2)
## Instruments: ~educ + age + kidslt6 + nwifeinc + exper + I(exper^2)
## 
##                 Estimate   Std. Error  t value   Pr(>|t|)    
## (Intercept) -0.655725440  0.337788292 -1.94123   0.052894 .  
## hours        0.000125900  0.000254611  0.49448   0.621223    
## educ         0.110330004  0.015524358  7.10690 5.0768e-12 ***
## exper        0.034582356  0.019491555  1.77422   0.076746 .  
## I(exper^2)  -0.000705769  0.000454080 -1.55428   0.120865    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.679427 on 423 degrees of freedom
## Number of observations: 428 Degrees of Freedom: 423 
## SSR: 195.26556 MSE: 0.461621 Root MSE: 0.679427 
## Multiple R-Squared: 0.125654 Adjusted R-Squared: 0.117385

As we can see, all else held constant, the labor supply curve slopes upward: the coefficient on $log(wage)$ is positive. We can also compute the labor supply elasticity. Elasticity measures the percentage change in y (hours) due to a one percent change in x (wage).

The estimated coefficient for $log(wage)$, $\alpha_1$ in the labor supply function (function that determines the hours), is approximately 1640. \[ \Delta \widehat{hours} \approx \frac{1640}{100} (\% \Delta wage) \] Multiplying both sides by $100/hours$ yields the following. \[ 100 \frac{\Delta \widehat{hours}}{hours} \approx \frac{100}{hours} \frac{1640}{100} (\% \Delta wage) \] \[ \% \Delta \widehat{hours} \approx \frac{1640}{hours} (\% \Delta wage) \] At the average hours worked ($1,303$), the estimated elasticity is $1,640/ 1,303 \approx 1.26$ which is relatively large.
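Because the elasticity $\hat{\alpha}_1 / hours$ depends on the level of hours, it can be helpful to evaluate it at more than one point. The sketch below takes the coefficient from the 2SLS output above and the average of 1,303 hours from the text; the helper name `elasticity_at` and the other hours values are illustrative assumptions.

```r
# Labor supply elasticity implied by the level-log hours equation
alpha1_hat <- 1639.55561   # 2SLS coefficient on log(wage) from the output above

# Elasticity evaluated at a given level of annual hours
elasticity_at <- function(hours) alpha1_hat / hours

elasticity_at(1303)                 # at the average hours worked
elasticity_at(c(500, 1303, 2000))   # the elasticity falls as hours rise
```

The first call reproduces the figure of about 1.26 reported in the text.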

Looking at the wage equation, once the endogeneity of hours is addressed by 2SLS, the coefficient on hours worked is not statistically significant. Compare this with what you find when the same equation is estimated as a multiple regression by OLS.

See the results of equations estimated individually by OLS by running the code below.

data(mroz, package='wooldridge')
eq.hrs   = hours    ~ log(wage)+educ+age+kidslt6+nwifeinc
eq.wage  = log(wage)~ hours    +educ+exper+I(exper^2)
summary(lm(eq.hrs,data=subset(mroz,!is.na(wage))))
summary(lm(eq.wage,data=subset(mroz,!is.na(wage))))

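David Romer (1993) proposed that more open economies should have lower rates of inflation, all else the same. The model he suggests can be stated as a simultaneous equations model: \[ inf=\beta_{10}+\alpha_{1}open+\beta_{11}log(pcinc)+u_1 \] \[ open=\beta_{20}+\alpha_{2}inf+\beta_{21}log(pcinc)+\beta_{22}log(land)+u_2 \] Here, $inf$ is the inflation rate, $pcinc$ is per capita income, $open$ is the share of imports relative to GDP, and $land$ is the area of a country. Running the code below, we find that more open economies have lower rates of inflation. The coefficient on $open$ in the inflation equation is significant at the 5% level against a two-sided alternative (p-value of about 0.021), and at about the 1% level against the one-sided alternative $\alpha_1 < 0$. The $open$ equation itself is not identified, because it excludes no exogenous variable, which is why its 2SLS standard errors in the output are enormous.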

data(openness, package='wooldridge')

#A system of equations
eq.1   = open~inf+log(pcinc)+log(land)
eq.2  = inf~open+log(pcinc)
eq.system= list(eq.1, eq.2)
instrum  = ~log(land)+log(pcinc)
summary(systemfit(eq.system,inst=instrum,
                  data=openness,
                  method="2SLS"))

## 
## systemfit results 
## method: 2SLS 
## 
##          N  DF    SSR detRCov   OLS-R2 McElroy-R2
## system 228 221 979617 15553.7 -6.60387   0.994595
## 
##       N  DF      SSR      MSE    RMSE         R2     Adj R2
## eq1 114 110 916552.8 8332.298 91.2814 -13.375497 -13.767556
## eq2 114 111  63064.2  568.146 23.8358   0.030876   0.013415
## 
## The covariance matrix of the residuals
##         eq1      eq2
## eq1 8332.30 2172.190
## eq2 2172.19  568.146
## 
## The correlations of the residuals
##          eq1      eq2
## eq1 1.000000 0.998356
## eq2 0.998356 1.000000
## 
## 
## 2SLS estimates for 'eq1' (equation 1)
## Model Formula: open ~ inf + log(pcinc) + log(land)
## Instruments: ~log(land) + log(pcinc)
## 
##                 Estimate   Std. Error t value Pr(>|t|)
## (Intercept)  6.86167e+01  1.40977e+08       0        1
## inf         -3.84203e+00  1.11752e+07       0        1
## log(pcinc)   1.28182e+00  2.13887e+06       0        1
## log(land)    2.24466e+00  2.85392e+07       0        1
## 
## Residual standard error: 91.281424 on 110 degrees of freedom
## Number of observations: 114 Degrees of Freedom: 110 
## SSR: 916552.812952 MSE: 8332.2983 Root MSE: 91.281424 
## Multiple R-Squared: -13.375497 Adjusted R-Squared: -13.767556 
## 
## 
## 2SLS estimates for 'eq2' (equation 2)
## Model Formula: inf ~ open + log(pcinc)
## Instruments: ~log(land) + log(pcinc)
## 
##              Estimate Std. Error  t value Pr(>|t|)  
## (Intercept) 26.899337  15.401196  1.74657 0.083477 .
## open        -0.337487   0.144121 -2.34169 0.020981 *
## log(pcinc)   0.375825   2.015081  0.18651 0.852388  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 23.835811 on 111 degrees of freedom
## Number of observations: 114 Degrees of Freedom: 111 
## SSR: 63064.19425 MSE: 568.145894 Root MSE: 23.835811 
## Multiple R-Squared: 0.030876 Adjusted R-Squared: 0.013415

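#Alternatively, run the two stages by hand as a standard IV model.
#The point estimates match 2SLS, but the second-stage lm() standard errors are not valid.
#rfreg= lm(open~log(pcinc)+log(land),data=openness) #First stage
#ssreg= lm(inf~rfreg$fitted.values+log(pcinc),data=openness) #Second stage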

The Permanent Income Hypothesis is a hypothesis about consumer spending. It states that people spend in line not with their current income but with their expected long-term (or permanent) income. The main equation can be stated as \[ gc_t = \beta_0 + \beta_1 gy_t + \beta_2 r3_t +u_t \] Here, $gc_t$ is annual growth in real per capita consumption, measured as the change in the log of consumption, $gy_t$ is growth in real disposable income, and $r3_t$ is the ex post real interest rate as measured by the return on three-month T-bills, or $r3_t = i3_t - inf_t$. Traditionally, we think of income and interest rates as jointly determined. The full model can be stated as a three-equation system. \[ gc_t = \beta_{10} + \beta_{11} gy_t + \beta_{12} r3_t +u_{1t} \] \[ gy_t = \beta_{20} + \beta_{21} gc_{t-1} + \beta_{22} gy_{t-1} + \beta_{23} r3_{t-1} + u_{2t} \] \[ r3_t = \beta_{30} + \beta_{31} gc_{t-1} + \beta_{32} gy_{t-1} + \beta_{33} r3_{t-1} + u_{3t} \]

data(consump, package='wooldridge')

#A system of equations
eq.1   = gy~gc_1+gy_1+r3_1
eq.2   = r3~gc_1+gy_1+r3_1
eq.3  = gc~gy+r3
eq.system= list(eq.1, eq.2, eq.3)
instrum  = ~gc_1+gy_1+r3_1
summary(systemfit(eq.system,inst=instrum,
                  data=consump,
                  method="2SLS"))

## 
## systemfit results 
## method: 2SLS 
## 
##          N DF     SSR detRCov   OLS-R2 McElroy-R2
## system 105 94 52.8161       0 0.651848    0.61607
## 
##      N DF       SSR      MSE     RMSE       R2   Adj R2
## eq1 35 31  0.008233 0.000266 0.016297 0.279087 0.209321
## eq2 35 31 52.806075 1.703422 1.305152 0.651875 0.618186
## eq3 35 32  0.001786 0.000056 0.007471 0.677905 0.657774
## 
## The covariance matrix of the residuals
##              eq1         eq2          eq3
## eq1  2.65584e-04 0.009452045 -1.30518e-06
## eq2  9.45205e-03 1.703421775  1.71743e-04
## eq3 -1.30518e-06 0.000171743  5.58191e-05
## 
## The correlations of the residuals
##            eq1       eq2        eq3
## eq1  1.0000000 0.4443897 -0.0107196
## eq2  0.4443897 1.0000000  0.0176127
## eq3 -0.0107196 0.0176127  1.0000000
## 
## 
## 2SLS estimates for 'eq1' (equation 1)
## Model Formula: gy ~ gc_1 + gy_1 + r3_1
## Instruments: ~gc_1 + gy_1 + r3_1
## 
##                 Estimate   Std. Error  t value  Pr(>|t|)   
## (Intercept)  0.006744291  0.005473074  1.23227 0.2271140   
## gc_1         1.234539605  0.395506414  3.12141 0.0038783 **
## gy_1        -0.522556240  0.278144227 -1.87872 0.0697142 . 
## r3_1         0.000846557  0.001394895  0.60690 0.5483388   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.016297 on 31 degrees of freedom
## Number of observations: 35 Degrees of Freedom: 31 
## SSR: 0.008233 MSE: 0.000266 Root MSE: 0.016297 
## Multiple R-Squared: 0.279087 Adjusted R-Squared: 0.209321 
## 
## 
## 2SLS estimates for 'eq2' (equation 2)
## Model Formula: r3 ~ gc_1 + gy_1 + r3_1
## Instruments: ~gc_1 + gy_1 + r3_1
## 
##               Estimate Std. Error  t value Pr(>|t|)    
## (Intercept)   0.832247   0.438320  1.89872 0.066944 .  
## gc_1         16.445541  31.674802  0.51920 0.607309    
## gy_1        -42.900522  22.275653 -1.92589 0.063332 .  
## r3_1          0.849559   0.111712  7.60487 1.42e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.305152 on 31 degrees of freedom
## Number of observations: 35 Degrees of Freedom: 31 
## SSR: 52.806075 MSE: 1.703422 Root MSE: 1.305152 
## Multiple R-Squared: 0.651875 Adjusted R-Squared: 0.618186 
## 
## 
## 2SLS estimates for 'eq3' (equation 3)
## Model Formula: gc ~ gy + r3
## Instruments: ~gc_1 + gy_1 + r3_1
## 
##                 Estimate   Std. Error  t value   Pr(>|t|)    
## (Intercept)  0.008059689  0.003232742  2.49314 0.01802571 *  
## gy           0.586188030  0.134573716  4.35589 0.00012763 ***
## r3          -0.000269401  0.000764035 -0.35260 0.72669802    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.007471 on 32 degrees of freedom
## Number of observations: 35 Degrees of Freedom: 32 
## SSR: 0.001786 MSE: 5.6e-05 Root MSE: 0.007471 
## Multiple R-Squared: 0.677905 Adjusted R-Squared: 0.657774

Since the estimated coefficient on $gy_t$ in the $gc_t$ equation is positive and statistically significant, we reject the permanent income hypothesis: if spending followed only permanent income, growth in current income should not help explain consumption growth. Higher current disposable income does increase consumption.

Homework Problems

Computer Exercise C1.
Use data set smoke from package wooldridge for this exercise.
1. A model to estimate the effects of smoking on annual income (perhaps through lost work days due to illness, or productivity effects) is \[log(income) = \beta_0 + \beta_1 cigs + \beta_2 educ +\beta_3 age+\beta_4 age^2 + u_1\] where cigs is the number of cigarettes smoked per day, on average. How do you interpret $\beta_1$?
2. To reflect the fact that cigarette consumption might be jointly determined with income, a demand for cigarettes equation is \[ cigs = \gamma_0 + \gamma_1 log(income) + \gamma_2 educ + \gamma_3 age + \gamma_4 age^2 + \gamma_5 log(cigpric) + \gamma_6 restaurn + u_2 \] where cigpric is the price of a pack of cigarettes (in cents) and restaurn is a binary variable equal to unity if the person lives in a state with restaurant smoking restrictions. Assuming these are exogenous to the individual, what signs would you expect for $\gamma_5$ and $\gamma_6$?
3. Under what assumption is the income equation from part 1 identified?
4. Estimate the income equation by OLS and discuss the estimate of $\beta_1$.
5. Estimate the reduced form for cigs. (Recall that this entails regressing cigs on all exogenous variables.) Are log(cigpric) and restaurn significant in the reduced form?
6. Now, estimate the income equation by 2SLS. Discuss how the estimate of $\beta_1$ compares with the OLS estimate.
7. Do you think that cigarette prices and restaurant smoking restrictions are exogenous in the income equation?

References

Wooldridge, J. (2019). Introductory econometrics: a modern approach. Boston, MA: Cengage.

Heiss, F. (2016). Using R for introductory econometrics. Düsseldorf: Florian Heiss, CreateSpace.