gunce_sas Tracker
https://communities.sas.com/kntur85557/tracker
Sun, 14 Apr 2024 20:44:30 GMT
Re: Technical efficiency estimated using a quantile regression
https://communities.sas.com/t5/Statistical-Procedures/Technical-efficiency-estimated-using-a-quantile-regression/m-p/911713#M45259
<P>Hi Tomas,</P>
<P>Yes, quantile regression can be a nice semiparametric alternative to the DEA and SFA methods for estimating a production function. However, the technical efficiency measures obtained from this method may not perform as well as those obtained from DEA and SFA (see <A href="https://www.york.ac.uk/media/economics/documents/herc/wp/07_14.pdf" target="_blank">https://www.york.ac.uk/media/economics/documents/herc/wp/07_14.pdf</A>).</P>
<P>To my knowledge, SAS has no procedure that outputs a technical efficiency measure when the production function is estimated by quantile regression, the way PROC QLIM or PROC FRONTIER does for stochastic frontier models. However, the output from PROC QUANTREG or PROC QUANTSELECT can be used to calculate technical efficiency based on its definition, which can be found <A href="https://link.springer.com/content/pdf/10.1007/s41685-022-00228-9.pdf" target="_self">here</A> or in the reference you linked.</P>
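<P>As a rough illustration of that last step, here is a minimal Python sketch. It assumes one common output-oriented definition, in which technical efficiency is the ratio of observed output to the fitted frontier output, capped at 1; the linked references may define it differently. It also assumes the frontier values are fitted log outputs from a high-quantile regression (for example, predicted values saved from PROC QUANTREG). The function name and the numbers are hypothetical.</P>

```python
import math


def technical_efficiency(log_output, log_frontier):
    """Return exp(min(0, ln y_i - ln yhat_i)) for each unit.

    log_output   : observed log outputs (ln y_i)
    log_frontier : fitted log outputs from a high-quantile regression (assumed input)
    Units on or above the fitted frontier are capped at an efficiency of 1.
    """
    return [math.exp(min(0.0, y - f)) for y, f in zip(log_output, log_frontier)]


# Hypothetical data: three units facing the same fitted frontier output of 100.
observed = [math.log(80.0), math.log(100.0), math.log(120.0)]
frontier = [math.log(100.0)] * 3
print(technical_efficiency(observed, frontier))
```

<P>Capping at 1 is a design choice; some authors instead leave values above 1 uncapped and read them as "super-efficient" units.</P>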
<P>Best regards,</P>
<P>Gunce</P>
gunce_sas, Tue, 16 Jan 2024 22:53:31 GMT
Modern Econometrics Methods
https://communities.sas.com/t5/SAS-Global-Forum-Proceedings/Modern-Econometrics-Methods/ta-p/741420
<DIV class="conf-presentation">
<DIV class="authors">
<DIV class="author-head" contenteditable="false">Presenter</DIV>
<P>Gunce Walton, SAS</P>
</DIV>
<H2 contenteditable="false">Abstract</H2>
<P>This session overviews recent additions to SAS® Econometrics and demonstrates several examples of the new Frontier procedure for the analysis of stochastic frontier production or cost models, new sequential Monte Carlo methods for nonlinear non-Gaussian state space models, and a new data access engine to retrieve data from the Bureau of Economic Analysis databases.</P>
<H2>Watch the presentation</H2>
<P>Watch <A href="https://youtu.be/kN29qqleQ3U" target="_self">Modern Econometrics Methods</A> on the SAS Users YouTube channel.</P>
</DIV>
gunce_sas, Thu, 20 May 2021 12:58:21 GMT
Re: Endogeneity in linear mixed models
https://communities.sas.com/t5/Statistical-Procedures/Endogeneity-in-linear-mixed-models/m-p/594146#M28996
<P>Dear Cuneyt,</P>
<P> </P>
<P>You are right: PROC QLIM has an endogeneity test, and you can also model random effects (both a random intercept, as in your case, and random coefficients) for a single SUBJECT value in PROC QLIM by using the RANDOM statement. However, if you use the RANDOM statement, you can have only <U>one</U> MODEL statement. Therefore, you cannot model your reduced form equation (X1 = c0 + c2 X2 + c3 Z + v + e2) along with your structural equation (Y = b0 + b1 X1 + b2 X2 + u + e1), and both are necessary for the endogeneity test. <U>However</U>, you should go ahead and test for endogeneity of X1 in PROC QLIM even if you are not able to model the random effects, because any correlation between X1 and u and/or e1 will show up in the test, implying that you have an endogeneity problem in your main model. The only thing you won't know is which error component X1 is correlated with, because PROC QLIM treats them as a single error term (say, w = u + e1).</P>
<P> </P>
<P>For your second question, in either case you do have the problem of endogeneity, because the assumption u|(X1, X2)~N(0, sigma_u^2) or e1|(X1, X2, u)~N(0, sigma_e1^2) will be violated, and this implies endogeneity. Therefore, the test for endogeneity done in PROC QLIM will give you an answer.</P>
<P> </P>
<P>For your third question, no, you don't need to worry about the endogeneity because you are already correcting for it by modeling the structural and the reduced form equations together (as you would do in PROC QLIM). The only drawback is that you won't be modeling the random effect. This last point also answers your last question.</P>
<P> </P>
<P>I hope this helps,</P>
<P> </P>
<P>Best regards,</P>
<P>Gunce</P>
gunce_sas, Fri, 04 Oct 2019 16:03:02 GMT
Re: How to interpret the marginal effect of a log transformed independent variable in proc qlim?
https://communities.sas.com/t5/SAS-Forecasting-and-Econometrics/How-to-interpret-the-marginal-effect-of-a-log-transformed/m-p/576616#M3595
<P>Hello,</P>
<P>For a logit model, the marginal effect of a change in the jth regressor for observation i on the conditional probability that y_i = 1 is</P>
<P>G(x_i'b)[1 - G(x_i'b)]b_j</P>
<P>where G(x'b) = exp(x'b)/[1 + exp(x'b)].</P>
<P>You can request these marginal effects for each regressor from PROC QLIM by specifying the MARGINAL option in the OUTPUT statement. For example,</P>
<P>OUTPUT OUT=myoutputdata MARGINAL;</P>
<P>The average marginal effect is the sample average of these marginal effects. For a logit model, this is</P>
<P>(1/N) * SUM_i { G(x_i'b)[1 - G(x_i'b)] } b_j</P>
<P>You can obtain this by averaging the column for the marginal effect of the log of annual earnings over the observations in the data set you specify with the OUT= option (myoutputdata in the example above).</P>
<P>If you calculated the average marginal effect of the log of annual earnings as described above and obtained the value -0.0204, then the interpretation is that, on average, a one-unit increase in log earnings reduces the probability of the event occurring by about 2.04 percentage points. Because the regressor is in logs, a 1 percent increase in annual earnings raises log earnings by about 0.01, so it reduces that probability by about 0.0204 percentage points.</P>
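<P>To make the averaging concrete, here is a minimal Python sketch of the logit average marginal effect formula. It is only an illustration of the arithmetic, not what PROC QLIM runs internally; the function names, coefficient values, and data rows are hypothetical.</P>

```python
import math


def logistic(z):
    """G(z) = exp(z) / (1 + exp(z)), written in the equivalent form 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def average_marginal_effect(rows, beta, j):
    """Average over observations of G(x_i'b) * [1 - G(x_i'b)] * b_j.

    rows : list of regressor vectors x_i (include 1.0 for the intercept)
    beta : estimated logit coefficients, in the same order as each row
    j    : index of the regressor of interest
    """
    total = 0.0
    for x in rows:
        xb = sum(xk * bk for xk, bk in zip(x, beta))
        g = logistic(xb)
        total += g * (1.0 - g) * beta[j]
    return total / len(rows)


# Hypothetical example: intercept plus log annual earnings.
rows = [[1.0, 9.8], [1.0, 10.4], [1.0, 11.1]]
beta = [1.5, -0.2]
print(average_marginal_effect(rows, beta, j=1))
```

<P>Averaging the marginal-effect column written by the MARGINAL option should give the same kind of number as the function above returns.</P>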
<P>I hope this helps,</P>
<P>Gunce</P>
gunce_sas, Thu, 25 Jul 2019 14:43:08 GMT
Re: Proc qlim probit model estimate producing unexpected results
https://communities.sas.com/t5/SAS-Forecasting-and-Econometrics/Proc-qlim-probit-model-estimate-producing-unexpected-results/m-p/458170#M3106
<P>Hello,</P>
<P> </P>
<P>Please check out the link below to obtain the details about the binary probit model that QLIM estimates.</P>
<P> </P>
<P><A href="http://go.documentation.sas.com/?docsetId=etsug&docsetTarget=etsug_qlim_details01.htm&docsetVersion=14.3&locale=en" target="_blank">http://go.documentation.sas.com/?docsetId=etsug&docsetTarget=etsug_qlim_details01.htm&docsetVersion=14.3&locale=en</A></P>
<P> </P>
<P>As you see there, the probit model has an underlying latent model</P>
<P> </P>
<P>y* = x’beta + epsilon</P>
<P> </P>
<P>where y* cannot be observed, but we can observe y, and the relationship between y and y* is</P>
<P> </P>
<P>y = 1 if y* > 0</P>
<P>y = 0 otherwise</P>
<P> </P>
<P>When you say “the PROBIT regression coefficients” I understand that you are referring to beta in this model. There are no particular restrictions on beta as they are the regression coefficients from an unobserved linear model.</P>
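<P>The latent-variable setup above can be sketched in a few lines of Python. This is only an illustration under the usual assumption that epsilon is standard normal, so that P(y = 1 | x) = Phi(x'beta); the function names and the coefficient values are hypothetical.</P>

```python
import math


def standard_normal_cdf(z):
    """Phi(z), computed from the error function in the standard library."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_probability(x, beta):
    """P(y = 1 | x) = P(y* > 0 | x) = Phi(x'beta) when epsilon ~ N(0, 1)."""
    xb = sum(xk * bk for xk, bk in zip(x, beta))
    return standard_normal_cdf(xb)


# Hypothetical coefficients: beta can take any real value (negative, or larger
# than 1 in absolute value), and the implied probability still lies in (0, 1).
x = [1.0, 2.0]
for beta in ([0.0, 0.0], [-3.0, 0.5], [4.0, -1.2]):
    print(beta, probit_probability(x, beta))
```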
<P> </P>
<P>As for the second part of your question, it is true that keywords such as the model options normally turn blue in the editor to indicate that they are keywords, but do not worry if this does not happen. As long as you are not getting an error in the log window, you have not made a mistake in specifying that option.</P>
<P> </P>
<P>I hope this helps,</P>
<P>Gunce</P>
gunce_sas, Fri, 27 Apr 2018 15:34:35 GMT
Re: Ordered Logit with Endogeneity
https://communities.sas.com/t5/SAS-Forecasting-and-Econometrics/Ordered-Logit-with-Endogeneity/m-p/395148#M2661
<P>Hello,</P>
<P> </P>
<P>The ENDOGENOUS statement in PROC QLIM specifies the type of the dependent variables, that is, the variables that appear on the left-hand side of the equations. You can also use the options of the MODEL statement to achieve the same result. For example, you can get the same results as the SAS code you wrote with the statements below:</P>
<P> </P>
<P>proc qlim data = mydata itprint;</P>
<P>title "enrolled logit regression model results";</P>
<P>class race gender;</P>
<P>model curr_enrolled = age race gender / discrete (distribution = logit);</P>
<P>output out=logit marginal proball;</P>
<P>run;</P>
<P> </P>
<P>Note that if you have an endogeneity problem in your model, that is, if one or more of your regressors are correlated with your model errors, the ENDOGENOUS statement is <U>not</U> the way to account for it in your estimation. In the case of endogeneity, you should estimate your model of interest (the structural equation) and the reduced form equation(s) simultaneously. Of course, to be able to do this, you need to find proper instruments to form your reduced form model(s). To see whether you have endogeneity or not, you can use the ENDOTEST option of the MODEL statement. For more information, please see the Endogeneity and Instrumental Variables subsection of the PROC QLIM documentation.</P>
<P>(<A href="http://go.documentation.sas.com/?docsetId=etsug&docsetVersion=14.2&docsetTarget=etsug_qlim_details24.htm&locale=en">http://go.documentation.sas.com/?docsetId=etsug&docsetVersion=14.2&docsetTarget=etsug_qlim_details24.htm&locale=en</A>)</P>
<P> </P>
<P>Best regards,</P>
<P>Gunce</P>
gunce_sas, Tue, 12 Sep 2017 15:34:23 GMT
Re: Serial Autocorrelation Tobit
https://communities.sas.com/t5/SAS-Forecasting-and-Econometrics/Serial-Autocorrelation-Tobit/m-p/389222#M2609
<P>Hi,</P>
<P>As I mentioned before, you can use PROC NLMIXED and specify the nature of the standard errors.</P>
<P>Best regards,</P>
<P>Gunce</P>
gunce_sas, Fri, 18 Aug 2017 20:14:46 GMT
Re: Serial Autocorrelation Tobit
https://communities.sas.com/t5/SAS-Forecasting-and-Econometrics/Serial-Autocorrelation-Tobit/m-p/389114#M2607
<P>Hi,</P>
<P> </P>
<P>You must have an older version of SAS/ETS. PROC QLIM has supported the RANDOM statement since version 14.1.</P>
<P> </P>
<P>Gunce</P>
gunce_sas, Fri, 18 Aug 2017 14:03:23 GMT
Re: Serial Autocorrelation Tobit
https://communities.sas.com/t5/SAS-Forecasting-and-Econometrics/Serial-Autocorrelation-Tobit/m-p/388837#M2602
<P>Hi,</P>
<P> </P>
<P>You can estimate random-effects tobit models in PROC QLIM by using the new RANDOM statement. A simple example of the syntax, based on your model, would be</P>
<P>PROC QLIM;</P>
<P>MODEL Y = X / censored(lb=0);</P>
<P>RANDOM INT / SUBJECT=id METHOD=HERMITE(QPOINTS=12);</P>
<P>RUN;</P>
<P> </P>
<P>Currently, PROC QLIM does not offer an option for obtaining robust standard errors for heteroskedasticity and serial correlation.</P>
<P> </P>
<P>Please note that Stata's xttobit does not have a “robust” option, either.</P>
<P> </P>
<P>If you are willing to specify the nature of the standard errors, then you might estimate your model with PROC NLMIXED by modeling the standard errors explicitly.</P>
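<P>For reference, the kind of likelihood you would have to write out by hand in that approach is the censored-normal (tobit) log-likelihood. The Python sketch below shows only the plain version with censoring from below at 0 and a constant sigma; it contains no random effect and no serial correlation, both of which you would have to add yourself by replacing the constant sigma or the independence assumption with your own specification. The function names and numbers are hypothetical.</P>

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tobit_loglik(y, xb, sigma, lower=0.0):
    """Log-likelihood of a tobit model censored from below at `lower`.

    y     : observed responses
    xb    : linear predictors x_i'beta, one per observation
    sigma : error standard deviation (constant here, by assumption)
    """
    total = 0.0
    for yi, mi in zip(y, xb):
        if yi <= lower:
            # Censored observation: P(y* <= lower) = Phi((lower - x'b) / sigma)
            total += math.log(normal_cdf((lower - mi) / sigma))
        else:
            # Uncensored observation: normal density of the residual
            total += math.log(normal_pdf((yi - mi) / sigma) / sigma)
    return total


# Hypothetical data: one censored and two uncensored observations.
print(tobit_loglik(y=[0.0, 1.2, 2.5], xb=[-0.3, 1.0, 2.0], sigma=1.1))
```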
<P> </P>
<P>Best regards,</P>
<P>Gunce</P>
gunce_sas, Thu, 17 Aug 2017 14:59:46 GMT
Re: Heckman's correction and GLM
https://communities.sas.com/t5/SAS-Forecasting-and-Econometrics/Heckman-s-correction-and-GLM/m-p/251867#M1629
<P>Hi,</P>
<P>First of all, I would like to state what I understand from your problem:</P>
<P>Your selection model consists of two models. You have a probit selection equation that defines your selection “rule” and a model that you are actually interested in estimating (the response model). In your case, the response model is a GLM, i.e., the response variable distribution is a member of the exponential family, which includes the normal, Poisson, binomial, exponential, and gamma distributions.</P>
<P>If your response model is linear, which is a special case of the GLM, then all you need to do is use the HECKIT option of PROC QLIM. The HECKIT option requests that the selection model be estimated by Heckman’s two-step estimation method as defined in his 1979 paper (for details, see <A href="http://support.sas.com/documentation/cdl/en/etsug/67525/HTML/default/viewer.htm#etsug_qlim_details17.htm" target="_blank">http://support.sas.com/documentation/cdl/en/etsug/67525/HTML/default/viewer.htm#etsug_qlim_details17.htm</A>). Using the example that you pointed out, this can be done with the following SAS program:</P>
<P> </P>
<P>/*-- Sample Selection --*/</P>
<P>proc qlim data=mroz heckit;</P>
<P> model inlf = nwifeinc educ exper expersq</P>
<P> age kidslt6 kidsge6 /discrete;</P>
<P> model lwage = educ exper expersq / select(inlf=1);</P>
<P>run;</P>
<P> </P>
<P>If your response model is nonlinear, for example if you have a binary response model or exponential response model, then, most likely, applying this particular selection bias correction method by estimating the selection equation by probit and then plugging the estimated inverse Mills ratio into the second-stage estimation method using only the selected sample will NOT be valid. In this case, you need to figure out the nature of the bias based on the particular assumptions of your model and apply the two-step method manually.</P>
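<P>For the linear case, the correction term that the second stage adds as a regressor is the inverse Mills ratio evaluated at the fitted probit index from the selection equation. A minimal Python sketch of that quantity follows; the function names and the index values are hypothetical, and this is only an illustration of the formula, not of what PROC QLIM does internally.</P>

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def inverse_mills_ratio(index):
    """lambda(z) = phi(z) / Phi(z), evaluated at the fitted probit index z = w'gamma."""
    return normal_pdf(index) / normal_cdf(index)


# Hypothetical fitted probit indexes for three selected observations.
for z in (-1.0, 0.0, 1.5):
    print(z, inverse_mills_ratio(z))
```

<P>The ratio shrinks as the index grows, so observations that were very likely to be selected receive a small correction.</P>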
<P> </P>
<P>However, testing the null hypothesis of no selection bias when you have a binary response model can be done easily. For this, use the SECONDSTAGE=ML suboption of the HECKIT option and look at the t value of the _y.LAMBDA parameter, where y is the dependent variable in your response model. Below is an example:</P>
<P> </P>
<P>proc qlim data=mroz heckit(secondstage=ML);</P>
<P> model inlf = nwifeinc educ exper expersq</P>
<P> age kidslt6 kidsge6 /discrete;</P>
<P> model lwage = educ exper expersq / discrete select(inlf=1);</P>
<P>run;</P>
<P> </P>
<P>I hope this helps,</P>
<P>Best regards,</P>
<P>Gunce</P>
gunce_sas, Tue, 23 Feb 2016 20:57:49 GMT
Re: 2-stage Heckman (1979) procedure
https://communities.sas.com/t5/SAS-Forecasting-and-Econometrics/2-stage-Heckman-1979-procedure/m-p/134439#M797
<HTML><HEAD></HEAD><BODY><P>It looks correct, but without knowing the exact details of your models I can't be sure.</P><P>Actually, you can achieve what you are trying to do by estimating a selection model in PROC QLIM with the HECKIT option on. You need two MODEL statements: one that specifies the first model that you estimated (using the DISCRETE option) and one that specifies the second model (the model with the continuous dependent variable) using the SELECT option.</P></BODY></HTML>
gunce_sas, Fri, 03 Apr 2015 16:05:55 GMT
Re: What is SAS equivalent of ivprovit in STATA?
https://communities.sas.com/t5/SAS-Forecasting-and-Econometrics/What-is-SAS-equivalent-of-ivprovit-in-STATA/m-p/196970#M1229
<HTML><HEAD></HEAD><BODY><P>Hi Elizabeth,</P><P>If you want to use a probit model, you should specify the DIST= option as NORMAL (not LOGIT) or just leave this option out, because the default is DIST=NORMAL.</P><P>Other than that, your code looks correct.</P><P>If you would like to test whether p1 and p2 are in fact exogenous, you can modify your code as</P><P>proc qlim data=a;</P><P> model Y = p1 p2 p3 / discrete ENDOTEST(p1 p2);</P><P> model p1 p2 = p3 z1 z2 z3;</P><P>run;</P><P>If you’d like to test the validity of the instruments for p1 and p2, you can replace ENDOTEST(p1 p2) with OVERID(p1.z3 p2.z3) in the above code. The choice of which overidentifying instrument to put in the test should not change your result.</P><P>About your last point, recall that Stock and Yogo’s recommendation of using the F statistic to test the strength of the instruments is based on a linear model. Your model is nonlinear; therefore, I am not sure whether you can apply an F test (or an equivalent test) to your model to test the strength of the instruments in a straightforward way.</P><P>As Ken mentioned, there is no weak instrument test for your model in SAS.</P><P>I hope this helps,</P><P>Gunce</P></BODY></HTML>
gunce_sas, Wed, 18 Mar 2015 19:15:19 GMT
Re: 2-stage Heckman (1979) procedure
https://communities.sas.com/t5/SAS-Forecasting-and-Econometrics/2-stage-Heckman-1979-procedure/m-p/134435#M793
<HTML><HEAD></HEAD><BODY><P>Hi Elizabeth,</P><P>I overlooked the fact that you cannot have multiple equations when the (DIST=LOGISTIC) option is specified for a model. Since you have two equations, you cannot use that specification. Nevertheless, using a probit model instead of a logit should not change the results much, because the two models are very similar. The warning about the Hessian being singular can be due to collinearity or a general identification problem. Without seeing the data set, I cannot say much about this problem.</P><P>The _Rho parameter is important. It is the correlation coefficient between the errors of the two models, and it tells you whether you actually have selection bias in your sample. An insignificant _Rho usually implies that you don’t have a selection bias problem in your model of interest, or it can imply that your choice of model is not correct.</P><P>Standard error correction is necessary when you use a two-step procedure. If you don’t specify the HECKIT option, then the estimation is done in one step, and in that case no correction is needed.</P><P>You can account for heteroscedasticity by using the HETERO statement.</P><P>I hope this helps.</P><P>Best,</P><P>Gunce</P></BODY></HTML>
gunce_sas, Thu, 18 Sep 2014 19:31:24 GMT
Re: 2-stage Heckman (1979) procedure
https://communities.sas.com/t5/SAS-Forecasting-and-Econometrics/2-stage-Heckman-1979-procedure/m-p/134433#M791
<HTML><HEAD></HEAD><BODY><P>Hi Elizabeth,</P><P>I have a few comments about your second post.</P><P>You are correct that if you are using the HECKIT option of PROC QLIM, then the second-stage dependent variable has to be continuous in nature. However, you can still <SPAN style="text-decoration: underline;">consistently</SPAN> estimate your model using PROC QLIM even if you have a binary dependent variable for that model. The SAS code below estimates your selection model consistently:</P><P>PROC QLIM DATA=test;</P><P>/* the selection equation--probit */</P><P>MODEL HighAcq_dum = X1*X2 X1 X2 X3 X4 CatVar</P><P> Year1993 Year1994 Year1995 Year1996 Year1997 / DISCRETE;</P><P>/* the equation of interest */</P><P>MODEL completed = X1*X2 X1 X2 X3 X4 Year1993 Year1994 Year1995</P><P> Year1996 Year1997 / SELECT(HighAcq_dum=1) DISCRETE(DIST=LOGISTIC);</P><P>RUN;</P><P>Note that the HECKIT option is not on. This way, the two models are estimated simultaneously, and the endogeneity problem that occurs due to the selected sample is taken into account. This is a one-step method, and if your model is correct, it is more efficient than its two-step counterparts.</P><P>Now, about your first question, I am not sure that your two-step method for the model that you are interested in estimating would produce consistent estimates. In the first step you are estimating the probit model to calculate the inverse Mills ratio, and then you are using it to correct for the bias in the second stage for your logit model. However, note that Heckman, in his 1979 article, derives that bias correction, namely the inverse Mills ratio, for a linear model of interest, i.e., a model with a continuous dependent variable, under some particular distributional assumptions. In other words, the nature of the bias may depend on the nature of the dependent variable and the distributional assumptions of the model of interest. If so, then you are not correcting for that bias by including the inverse Mills ratio, because the bias may be something different. The two-step method you explained above may therefore give you inconsistent estimates.</P></BODY></HTML>
gunce_sas, Wed, 10 Sep 2014 17:41:34 GMT
Re: Logistic Regression with Instrumental Variable
https://communities.sas.com/t5/Statistical-Procedures/Logistic-Regression-with-Instrumental-Variable/m-p/144084#M7552
<HTML><HEAD></HEAD><BODY><P>Hi Niam,</P><P>Your understanding is correct. You should obtain the residuals for each reduced form model (as many as the number of endogenous explanatory variables); this makes up your first step. Then you insert them for the error term of the structural model and estimate; this is the second step. A simple test on the coefficients of these residuals will give you a test of endogeneity.</P><P>You also asked a very good question that I should have explained before. The control function approach is used and valid when the model of interest (the structural model) is nonlinear and the endogenous explanatory variables are all continuous. Let me emphasize this one more time: when you estimate a nonlinear model with endogenous explanatory variables, the nature of the endogenous explanatory variables matters. For the control function method to produce a consistent estimator, the corresponding reduced form equations must be linear. In your example this is the case, so you can use either a joint likelihood method, as in the QLIM example I wrote earlier, or a control function method.</P></BODY></HTML>
gunce_sas, Thu, 19 Jun 2014 14:12:18 GMT
Re: Logistic Regression with Instrumental Variable
https://communities.sas.com/t5/Statistical-Procedures/Logistic-Regression-with-Instrumental-Variable/m-p/144081#M7549
<HTML><HEAD></HEAD><BODY><P>I think that when the dependent variable is a fractional response variable, it should be modeled as truncated rather than censored. With censoring with lower bound 0 and upper bound 1, you are saying that observations that are negative or bigger than 1 actually exist but you are not able to observe them in your sample. With truncation with lower bound 0 and upper bound 1, you are saying that the support of the distribution is [0, 1] and observations can't exist beyond these boundaries. Hence,</P><P>Proc Qlim;</P><P>model Y=X / truncated(lb=0 ub=1);</P><P>model X=Z;</P><P>run;</P><P>may fit your data better.</P><P>When it comes to estimating this model with endogeneity using a two-step method, a control function method (which is also a two-step procedure) works, BUT the procedure that you described earlier won't work. Plugging in the estimates of the endogenous variables obtained in the first step and then estimating the nonlinear model in the second step will produce an inconsistent estimator. Instead of estimating the endogenous variables, you should estimate the error term of the reduced form model X = Z. Here is what I mean:</P><P>X is endogenous if the error term of the structural model, say u, is correlated with that of the reduced form model, say v. We can model this as</P><P>u = theta v + e, where e is independent of v and theta captures the correlation between u and v.</P><P>Therefore, you can write the model of interest (the structural model) as</P><P>Y = beta X + theta v + e</P><P>Y is fractional, so it's a nonlinear model.</P><P>Now, v is unobserved, so it should be replaced with its estimate. You should obtain this estimate in the first step, plug it into the above model, and then estimate that model appropriately. As far as I know, PROC LOGISTIC doesn't estimate models for fractional response variables, but I am not so sure; you may want to check on this.</P></BODY></HTML>
gunce_sas, Wed, 18 Jun 2014 18:54:50 GMT
Re: 2-stage regression with more than 2 models
https://communities.sas.com/t5/SAS-Forecasting-and-Econometrics/2-stage-regression-with-more-than-2-models/m-p/105609#M562
<HTML><HEAD></HEAD><BODY><P> </P><P><SPAN style="color: #1f497d;">Hello,</SPAN></P><P><SPAN style="color: #1f497d;">You can solve your problem in couple of ways. </SPAN><SPAN style="color: #1f497d;">One way is that you can use PROC SYSLIN with 2SLS option but write down your problem correctly. I believe the following commands address your problem better.</SPAN></P><P> </P><P style="background: white;"><STRONG style="color: navy; font-size: 10pt; background: white; font-family: 'Courier New';">proc</STRONG> <STRONG style="color: navy; font-size: 10pt; background: white; font-family: 'Courier New';">syslin</STRONG> <SPAN style="background: white; color: blue; font-family: 'Courier New'; font-size: 10pt;">data</SPAN><SPAN style="background: white; color: black; font-family: 'Courier New'; font-size: 10pt;">=Data.data <STRONG style="color: teal; font-size: 10pt; background: white; font-family: 'Courier New';">2</STRONG><SPAN style="background: white; color: black; font-family: 'Courier New'; font-size: 10pt;">sls </SPAN><SPAN style="background: white; color: blue; font-family: 'Courier New'; font-size: 10pt;">first</SPAN><SPAN style="background: white; color: black; font-family: 'Courier New'; font-size: 10pt;">;</SPAN></SPAN> </P><P style="background: white;"><SPAN style="background: white; color: blue; font-family: 'Courier New'; font-size: 10pt;">endogenous </SPAN><SPAN style="background: white; color: black; font-family: 'Courier New'; font-size: 10pt;">log(DepVar1) </SPAN><SPAN lang="EN" style="background: white; color: black; font-family: 'Courier New'; font-size: 10pt; mso-ansi-language: EN;">log(DepVar2) log(DepVar3)</SPAN><SPAN style="background: white; color: black; font-family: 'Courier New'; font-size: 10pt;">;</SPAN> </P><P style="background: white;"><SPAN lang="EN" style="background: white; color: blue; font-family: 'Courier New'; font-size: 10pt; mso-ansi-language: EN;">instruments</SPAN><SPAN lang="EN" style="background: white; color: black; 
font-family: 'Courier New'; font-size: 10pt; mso-ansi-language: EN;"> log(Exogenous1) log(Exogenous2) Dummy1 Dummy2 Dummy3 Dummy4 InstrumentVar1 InstrumentVar2</SPAN></P><P style="background: white;"><SPAN lang="EN" style="background: white; color: black; font-family: 'Courier New'; font-size: 10pt; mso-ansi-language: EN;"> lag(logDepVar3));</SPAN> </P><P style="background: white;"><SPAN style="background: white; color: blue; font-family: 'Courier New'; font-size: 10pt;">model </SPAN><SPAN style="background: white; color: black; font-family: 'Courier New'; font-size: 10pt;">log(DepVar1) = log(Exogenous1) log(DepVar2) Dummy1 Dummy2 Dummy3 Dummy4 InstrumentVar1;</SPAN> </P><P style="background: white;"><SPAN lang="EN" style="background: white; color: blue; font-family: 'Courier New'; font-size: 10pt; mso-ansi-language: EN;">model</SPAN><SPAN lang="EN" style="background: white; color: black; font-family: 'Courier New'; font-size: 10pt; mso-ansi-language: EN;"> log(DepVar2) = log(Exogenous1) log(DepVar1) log(DepVar3) Dummy1 InstrumentVar1 InstrumentVar2;</SPAN> </P><P style="background: white;"><SPAN lang="EN" style="background: white; color: blue; font-family: 'Courier New'; font-size: 10pt; mso-ansi-language: EN;">model</SPAN><SPAN lang="EN" style="background: white; color: black; font-family: 'Courier New'; font-size: 10pt; mso-ansi-language: EN;"> log(DepVar3) = lag(log(DepVar3)) log(Exogenous 1) ;</SPAN> </P><P style="background: white;"><SPAN style="background: white; color: blue; font-family: 'Courier New'; font-size: 10pt;">restrict </SPAN><SPAN lang="EN" style="background: white; color: black; font-family: 'Courier New'; font-size: 10pt; mso-ansi-language: EN;">log(DepVar1) = -log(DepVar3);</SPAN> </P><P style="background: white;"><STRONG><SPAN lang="EN" style="background: white; color: navy; font-family: 'Courier New'; font-size: 10pt; mso-ansi-language: EN;">run</SPAN></STRONG><SPAN lang="EN" style="background: white; color: black; font-family: 'Courier 
New'; font-size: 10pt; mso-ansi-language: EN;">;</SPAN></P><P style="background: white;"></P><P style="background: white;"><SPAN style="color: #1f497d; background: white; font-size: 10pt; mso-ansi-language: EN; font-family: 'Courier New';">Note that I used the fact that log(DepVar1/DepVar3) = log(DepVar1) - log(DepVar3).</SPAN></P><P> </P><P><SPAN style="color: #1f497d;">This first method does not take the correlation between the error terms of the equations into account. To account for this correlation, use the same code as above but change the 2sls option to 3sls.</SPAN></P><P></P><P><SPAN style="color: #1f497d;">As another method, you can use PROC QLIM:</SPAN> </P><P style="background: white;"><STRONG style="color: navy; font-size: 10pt; background: white; font-family: 'Courier New';">proc</STRONG> <STRONG style="color: navy; font-size: 10pt; background: white; font-family: 'Courier New';">qlim</STRONG> <SPAN style="background: white; color: blue; font-family: 'Courier New'; font-size: 10pt;">data</SPAN><SPAN style="background: white; color: black; font-family: 'Courier New'; font-size: 10pt;">=Data.data;</SPAN> </P><P style="background: white;"><SPAN style="background: white; color: blue; font-family: 'Courier New'; font-size: 10pt;">model </SPAN><SPAN style="background: white; color: black; font-family: 'Courier New'; font-size: 10pt;">log(DepVar1) = log(Exogenous1) log(DepVar2) Dummy1 Dummy2 Dummy3 Dummy4 InstrumentVar1;</SPAN></P><P><SPAN lang="EN" style="background: white; color: blue; font-family: 'Courier New'; font-size: 10pt; mso-ansi-language: EN;">model</SPAN><SPAN lang="EN" style="background: white; color: black; font-family: 'Courier New'; font-size: 10pt; mso-ansi-language: EN;"> log(DepVar2) = log(Exogenous1) log(DepVar1) log(DepVar3) Dummy1 InstrumentVar1 InstrumentVar2;</SPAN></P><P><SPAN lang="EN" style="background: white; color: black; font-family: 'Courier New'; font-size: 10pt; mso-ansi-language: EN;"></SPAN><SPAN lang="EN" style="background: white; 
color: blue; font-family: 'Courier New'; font-size: 10pt; mso-ansi-language: EN;">model</SPAN><SPAN lang="EN" style="background: white; color: black; font-family: 'Courier New'; font-size: 10pt; mso-ansi-language: EN;"> log(DepVar3) = lag(log(DepVar3)) log(Exogenous1);</SPAN> </P><P><SPAN style="background: white; color: blue; font-family: 'Courier New'; font-size: 10pt;">restrict </SPAN><SPAN lang="EN" style="background: white; color: black; font-family: 'Courier New'; font-size: 10pt; mso-ansi-language: EN;">log(DepVar1) = -log(DepVar3);</SPAN></P><P><STRONG><SPAN lang="EN" style="background: white; color: navy; font-family: 'Courier New'; font-size: 10pt; mso-ansi-language: EN;">run</SPAN></STRONG><SPAN lang="EN" style="background: white; color: black; font-family: 'Courier New'; font-size: 10pt; mso-ansi-language: EN;">;</SPAN></P><P><SPAN lang="EN" style="background: white; color: black; font-family: 'Courier New'; font-size: 10pt; mso-ansi-language: EN;"></SPAN> </P><P style="background: white;"><SPAN lang="EN" style="background: white; color: black; font-family: 'Courier New'; font-size: 10pt; mso-ansi-language: EN;">PROC QLIM estimates these models jointly in one step, and it does so more efficiently by using maximum likelihood estimation (MLE). But here is the catch: because there is simultaneity in the equations, the coefficient estimates of the endogenous variables will most likely be inconsistent. If you had only a SUR problem, PROC QLIM would be the best you could do (under the assumption that the errors follow a multivariate normal distribution).</SPAN></P><P></P><P><SPAN lang="EN" style="background: white; color: black; font-family: 'Courier New'; font-size: 10pt; mso-ansi-language: EN;">I hope this helps.</SPAN></P></BODY></HTML>Tue, 12 Feb 2013 18:37:29 GMThttps://communities.sas.com/t5/SAS-Forecasting-and-Econometrics/2-stage-regression-with-more-than-2-models/m-p/105609#M562gunce_sas2013-02-12T18:37:29Z