Re: Heckman's correction and GLM

Posted 02-19-2016 06:01 PM
(2536 views)

Hi,

I am trying to estimate a GLM within a two-step Heckman correction. I have looked at the reference materials. The SAS documentation (http://support.sas.com/documentation/cdl/en/etsug/67525/HTML/default/viewer.htm#etsug_qlim_examples0...) shows that the selection model and the response model are estimated together. It also shows some other model types you can specify in this framework, but I don't believe a GLM can be modeled the same way, together with the selection probability model.

So, my question: would applying the two-step method be the right approach? That is, estimating the selection probability model, then calculating the inverse Mills ratio, and then including it in the GLM model specification.

Thanks for any help you could provide.

1 ACCEPTED SOLUTION


Hi,

First of all, I would like to state what I understand from your problem:

Your selection model consists of two models. You have a probit selection equation that defines your selection “rule” and a model that you are actually interested in estimating (the response model). In your case, the response model is a GLM, i.e., the response variable distribution is a member of the exponential family, which includes the normal, Poisson, binomial, exponential, and gamma distributions.

If your response model is linear, which is a special case of the GLM, then all you need to do is use the HECKIT option of PROC QLIM. The HECKIT option requests that the selection model be estimated by Heckman’s two-step method as defined in his 1979 paper (for details, see http://support.sas.com/documentation/cdl/en/etsug/67525/HTML/default/viewer.htm#etsug_qlim_details17...). Using the example that you pointed to, this can be done with the following SAS program:

    /*-- Sample Selection --*/
    proc qlim data=mroz heckit;
       model inlf = nwifeinc educ exper expersq
                    age kidslt6 kidsge6 / discrete;
       model lwage = educ exper expersq / select(inlf=1);
    run;
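For reference, the two steps that HECKIT automates in the linear case can be sketched by hand roughly as follows. This is an illustrative sketch, not the PROC QLIM implementation: it assumes the same mroz data set as the documentation example, and the names stage1, zg, and lambda are made up for the illustration.

```sas
/* Step 1: probit selection equation on the full sample.            */
/* XBETA= saves the estimated linear predictor (hypothetical: zg).  */
proc logistic data=mroz;
   model inlf(event='1') = nwifeinc educ exper expersq
                           age kidslt6 kidsge6 / link=probit;
   output out=stage1 xbeta=zg;
run;

/* Inverse Mills ratio: standard normal density over distribution   */
/* function, both evaluated at the probit linear predictor.         */
data stage1;
   set stage1;
   lambda = pdf('NORMAL', zg) / cdf('NORMAL', zg);
run;

/* Step 2: OLS on the selected sample, with lambda as a regressor.  */
proc reg data=stage1;
   where inlf = 1;
   model lwage = educ exper expersq lambda;
run;
quit;
```

The standard errors printed by this second-stage OLS do not account for lambda being an estimated regressor, so the sketch is best used only to cross-check the point estimates against the HECKIT output.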

If your response model is nonlinear, for example a binary response model or an exponential response model, then this particular correction (estimating the selection equation by probit and then plugging the estimated inverse Mills ratio into the second-stage estimation on the selected sample only) will most likely NOT be valid. In this case, you need to work out the nature of the bias from the particular assumptions of your model and apply the two-step method manually.
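To illustrate the difference, here is a sketch under textbook assumptions that are not stated in this thread: selection occurs when $z\gamma + v > 0$, and the errors $(u, v)$ are bivariate normal with $\mathrm{Var}(v) = 1$, $\mathrm{Var}(u) = \sigma^2$, and correlation $\rho$. The notation is illustrative only, and the exponential model shown is just one possible nonlinear specification.

```latex
\begin{align*}
% Linear response: y = x\beta + u
\mathbb{E}[\,y \mid x, z, s = 1\,]
  &= x\beta + \rho\sigma\,\lambda(z\gamma),
  \qquad \lambda(c) = \frac{\phi(c)}{\Phi(c)} \\[6pt]
% Exponential response: y = \exp(x\beta + u)
\mathbb{E}[\,y \mid x, z, s = 1\,]
  &= \exp\!\left(x\beta + \tfrac{\sigma^{2}}{2}\right)
     \frac{\Phi(z\gamma + \rho\sigma)}{\Phi(z\gamma)}
\end{align*}
```

In the linear case the selection bias is an additive term proportional to the inverse Mills ratio, so adding the estimated ratio as a regressor corrects it. In the exponential case the bias enters as a multiplicative factor that is not a function of the inverse Mills ratio alone, so adding the ratio to the linear predictor does not reproduce it.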

However, testing the null hypothesis of no selection bias when you have a binary response model is easy. Use the SECONDSTAGE=ML suboption of the HECKIT option and look at the t value of the coefficient on the _y.LAMBDA parameter, where y is the dependent variable in your response model. Below is an example:

    proc qlim data=mroz heckit(secondstage=ML);
       model inlf = nwifeinc educ exper expersq
                    age kidslt6 kidsge6 / discrete;
       model lwage = educ exper expersq / discrete select(inlf=1);
    run;

I hope this helps,

Best regards,

Gunce
