cd2011
Calcite | Level 5

Hi,

 

I am trying to estimate a GLM within Heckman's two-step correction method. I have looked at the reference materials. The SAS documentation (http://support.sas.com/documentation/cdl/en/etsug/67525/HTML/default/viewer.htm#etsug_qlim_examples0...) shows that the selection model and the response model are estimated together. It also shows some other model types you can specify in this framework, but I don't believe a GLM can be modeled the same way, together with the selection probability model.

 

So, my question: would applying the two steps manually be the right approach? That is, estimating the selection probability model, then calculating the inverse Mills ratio, and then adding it to the GLM model specification.

 

Thanks for any help you could provide. 

1 ACCEPTED SOLUTION

Accepted Solutions
gunce_sas
SAS Employee

Hi,

First of all, I would like to state what I understand from your problem:

Your selection model consists of two models. You have a probit selection equation that defines your selection “rule” and a model that you are actually interested in estimating (the response model). In your case, the response model is a GLM, i.e., the response variable distribution is a member of the exponential family, which includes the normal, Poisson, binomial, exponential, and gamma distributions.

If your response model is linear, which is a special case of the GLM, then all you need to do is use the HECKIT option of PROC QLIM. The HECKIT option requests that the selection model be estimated by Heckman’s two-step estimation method as defined in his 1979 paper (for details, see http://support.sas.com/documentation/cdl/en/etsug/67525/HTML/default/viewer.htm#etsug_qlim_details17...). Using the example that you pointed out, this can be done with the following SAS program:

 

/*-- Sample Selection --*/
proc qlim data=mroz heckit;
   model inlf = nwifeinc educ exper expersq
                age kidslt6 kidsge6 / discrete;
   model lwage = educ exper expersq / select(inlf=1);
run;
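
For reference, the correction that the two-step method applies in the linear case can be written out as below. This is only a sketch under the standard Heckman (1979) assumptions of jointly normal errors; the symbols (w, x, γ, β, ρ, σ, λ) are notation introduced here for illustration and are not names taken from the PROC QLIM output. In the example above, s corresponds to inlf and y to lwage.

```latex
\begin{aligned}
s_i &= \mathbf{1}\{\, w_i'\gamma + v_i > 0 \,\}, \qquad
y_i = x_i'\beta + u_i \quad \text{(observed only if } s_i = 1\text{)},\\[4pt]
(u_i, v_i) &\sim N\!\left(0,\ \begin{pmatrix} \sigma^2 & \rho\sigma \\ \rho\sigma & 1 \end{pmatrix}\right),\\[4pt]
E[\, y_i \mid x_i, w_i, s_i = 1 \,] &= x_i'\beta + \rho\sigma\,\lambda(w_i'\gamma),
\qquad \lambda(z) = \frac{\phi(z)}{\Phi(z)}.
\end{aligned}
```

The first step estimates γ by probit on the full sample. The second step regresses y on x and the estimated inverse Mills ratio λ(w'γ) using the selected observations only. The coefficient on the inverse Mills ratio estimates ρσ, so under these assumptions a value of zero corresponds to no selection bias.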

 

If your response model is nonlinear, for example a binary response model or an exponential response model, then the standard correction (estimating the selection equation by probit and then plugging the estimated inverse Mills ratio into a second-stage estimation that uses only the selected sample) will most likely NOT be valid. In this case, you need to work out the form of the selection bias from the particular assumptions of your model and apply the two-step method manually.
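
To see why the additive inverse Mills ratio term does not carry over, here is one illustrative case worked out. It assumes an exponential mean with a normally distributed error inside the exponent; this is an assumption chosen for the sketch, not necessarily the model you have, and the notation (w, x, γ, β, ρ, σ) is introduced here for illustration only.

```latex
\begin{aligned}
s_i &= \mathbf{1}\{\, w_i'\gamma + v_i > 0 \,\}, \qquad
y_i = \exp(x_i'\beta + u_i) \quad \text{(observed only if } s_i = 1\text{)},\\[4pt]
(u_i, v_i) &\sim N\!\left(0,\ \begin{pmatrix} \sigma^2 & \rho\sigma \\ \rho\sigma & 1 \end{pmatrix}\right),\\[4pt]
E[\, e^{u_i} \mid v_i \,] &= \exp\!\left(\rho\sigma v_i + \tfrac{1}{2}\sigma^2(1-\rho^2)\right),\\[4pt]
E[\, y_i \mid x_i, w_i, s_i = 1 \,] &= \exp\!\left(x_i'\beta + \tfrac{1}{2}\sigma^2\right)
\frac{\Phi(w_i'\gamma + \rho\sigma)}{\Phi(w_i'\gamma)}.
\end{aligned}
```

Under these assumptions the selection effect enters as a multiplicative ratio of normal CDFs rather than as an additive term ρσ·φ/Φ, so simply adding the inverse Mills ratio as a regressor in the second stage would not remove the bias. Other response distributions lead to other correction terms, which is why the form has to be derived case by case.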

 

However, testing the null hypothesis of no selection bias when you have a binary response model can be done easily. For this, use the SECONDSTAGE=ML suboption of the HECKIT option and look at the t value of the coefficient on the _y.LAMBDA parameter, where y is the dependent variable in your response model. Below is an example:

 

proc qlim data=mroz heckit(secondstage=ML);
   model inlf = nwifeinc educ exper expersq
                age kidslt6 kidsge6 / discrete;
   /* lwage stands in for the response here; substitute your own binary (0/1) response variable */
   model lwage = educ exper expersq / discrete select(inlf=1);
run;

 

I hope this helps,

Best regards,

Gunce

