proc regress data=want;
by country;
class FFI12;
model bdr= fam EBITDA_TA MTB LNTA FA_TA RD_TA stdd;
run;
This is my first-stage regression. For the second stage, I want to use another variable, std, as the dependent variable and the predicted value of bdr from the first regression as an independent variable. So the second model would look like this:
by Country; class FFI12;
model std= predict(bdr) fam EBITDA_TA MTB LNTA FA_TA RD_TA stdd
For reference, I am attaching a table from a published paper.
Can anyone guide me on how to do that?
proc regress does not exist in SAS. Assuming you mean proc reg and that FFI12 plays no role in your model (proc reg has no class statement), you could, in theory, try this:
proc reg data=want;
by country;
model bdr = fam EBITDA_TA MTB LNTA FA_TA RD_TA stdd;
output out=pred predicted=predBdr;
run;
proc reg data=pred;
by country;
model std = predBdr fam EBITDA_TA MTB LNTA FA_TA RD_TA stdd;
run;
but this is unlikely to work as you intend, because predBdr will be a linear combination of your other predictors (i.e., it will not add any new information to the model) and proc reg will detect this.
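To spell out why, here is a sketch of the algebra. The notation is mine, not from the posts above: x_1, ..., x_7 stand for the seven regressors (fam, EBITDA_TA, MTB, LNTA, FA_TA, RD_TA, stdd) and the alpha-hats are the first-stage estimates within one country.

```latex
\widehat{bdr}_i = \hat{\alpha}_0 + \sum_{j=1}^{7} \hat{\alpha}_j x_{ij}
\quad\Longrightarrow\quad
\widehat{bdr}_i - \hat{\alpha}_0 - \sum_{j=1}^{7} \hat{\alpha}_j x_{ij} = 0
\quad \text{for every observation } i .
```

The second-stage design matrix has the columns 1, predBdr, x_1, ..., x_7, and the identity above is an exact linear dependence among them. The coefficient on predBdr therefore cannot be estimated separately from the coefficients on the other seven variables.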
When I run the given code, I get the following results. In the first table, which shows the first-stage regression, the coefficients are small and in line with previous studies. But in the second table, which shows the second-stage regression (where the dependent variable is changed and the predicted bdr, predBdr, obtained from the first regression, is used as an independent variable), the coefficients and standard errors are very large. They cannot be used and are not in line with previous studies. Can you guide me?
Why are the instrumental variables that you're using to model bdr the same variables you're using to model std? This is what is causing the multicollinearity and blowing up your standard errors.
To confirm, in the model statement in PROC REG, request variance inflation factors. If the VIF for predBdr is >10, it is the cause (or one of the causes) of the multicollinearity in your model.
model std = predBdr fam EBITDA_TA MTB LNTA FA_TA RD_TA stdd / vif;
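For context, the variance inflation factor of predictor j is defined from R_j^2, the R-squared obtained when predictor j is regressed on all the other predictors in the model (the subscript j is my notation):

```latex
\mathrm{VIF}_j = \frac{1}{1 - R_j^2},
\qquad
\mathrm{VIF}_j > 10 \iff R_j^2 > 0.9 .
```

So the rule of thumb of 10 flags a predictor when more than 90% of its variance is explained by the other predictors.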
Instead of running PROC REG twice, below is another way to run 2SLS using PROC SYSLIN, but you'll still get the same high standard errors because your instrumental variables and model variables are the same.
proc syslin data=dsn 2sls covout first out=syslin_output;
by country;
endogenous bdr;
instruments fam EBITDA_TA MTB LNTA FA_TA RD_TA stdd;
model std= bdr fam EBITDA_TA MTB LNTA FA_TA RD_TA stdd / plot;
output r=residuals;
run; quit;
Why do you have to do this regression in two stages?
One of the previous studies uses the same regression in its paper, and I am trying to apply the same approach, changing just one basic variable: they studied multinational and domestic firms, whereas I am comparing family and non-family firms.
The rest is the same as I explained before. For reference, I am attaching their second-stage regression table; the model is the same as I explained earlier.
The table indicates that your instrumental variables are fam, EBITDA_TA, MTB, LNTA, FA_TA, RD_TA, and stdd. These are used to get predicted values of bdr.
Your model variables, used in the second stage, are the predicted values of bdr from the first stage regression, along with MNC20_FSALES, MNC50_FSALES, MNC20_FASSETS, and MNC50_FASSETS.
Unlike the model from your original post, the instrumental variables in the first stage are not the same as the model variables in the second stage.
The code below will work.
proc syslin data=dsn 2sls covout first out=syslin_output;
by country;
endogenous bdr;
instruments fam EBITDA_TA MTB LNTA FA_TA RD_TA stdd;
model std= bdr MNC20_FSALES MNC50_FSALES MNC20_FASSETS MNC50_FASSETS/ plot;
output r=residuals;
run; quit;
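One practical note on the BY statement: BY-group processing expects the input data set to be sorted (or indexed) by the BY variable. A minimal sketch, assuming the data set is named dsn as in the code above and has not been sorted yet:

```sas
/* Sort by the BY variable so that "by country;" in PROC SYSLIN works. */
/* dsn is the data set name used in the PROC SYSLIN call above.        */
proc sort data=dsn;
    by country;
run;
```

Run this once before the PROC SYSLIN step. If the data are already sorted by country, it is harmless.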