I am running PROC SURVEYREG with industry and year fixed effects, along with firm clustering. I am trying to estimate something such as "a one-unit increase in (an independent variable) leads to an X% increase/decrease in (a dependent variable)", which is related to marginal effects.
A friend of mine says that a statement like the one above comes from the regression itself, not from a marginal effect.
Is this true? If this is true, can I estimate the statement above with surveyreg?
SURVEYREG assumes that the response is normally distributed and this is most definitely not true for a binary response. For a binary response, you should fit an appropriate model such as a logistic model. If your data is survey data, then use PROC SURVEYLOGISTIC.
The proportion change in response probability for a unit increase in one of the predictors can be computed as (P(x+1) - P(x))/P(x) = P(x+1)/P(x) - 1, where P(x) is the response probability at some level of a predictor, X, and P(x+1) is the probability after a unit increase in X. It can be expressed as a percent change by multiplying by 100. This change in probability over a range is not, strictly speaking, a marginal effect.
Marginal effects and the change in probability are discussed in detail in this note. The note discusses how marginal effects can be estimated using the Margins macro (which cannot be used with survey data) and how the change (difference) in probabilities can be estimated using the NLMeans macro.
The proportion change can also be computed at fixed values of the other predictors in the model using the NLEST macro. See the section of the above note titled "Estimating the difference in probability at specific points". Using the logistic model fit there, the following NLEST macro call estimates the proportion change in response probability from BLAST=0.5 to 1.5 with SMEAR fixed at 0.63.
%nlest(instore=log, f=(logistic(b_p1+1.5*b_p2+.63*b_p3)/logistic(b_p1+0.5*b_p2+.63*b_p3))-1, label=PropChng .5 to 1.5)
The resulting estimate is 1.5085, or as a percentage, 150.85%, indicating about a 150% increase in the response probability from BLAST=0.5 to 1.5 at SMEAR=0.63. A standard error, test of equality to zero, and confidence limits are also provided. While PROC LOGISTIC was used here, the same can be done with a model fit in PROC SURVEYLOGISTIC.
An estimate that would be more like a marginal effect would be the average of the above proportions evaluated for each observation using each observation's particular value of SMEAR. The proportion change in each observation can be obtained using this macro call:
%nlest(instore=log, f=(logistic(b_p1+1.5*b_p2+smear*b_p3)/logistic(b_p1+0.5*b_p2+smear*b_p3))-1,
score=remiss, outscore=out)
and then averaging the estimated proportions (in variable PRED)
proc means data=out mean; var pred; run;
which yields a similar value, 1.512 or 151.2%. A proper standard error and confidence interval for this average estimate is not available though.
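For survey-type data, a minimal sketch of the PROC SURVEYLOGISTIC step that would feed the NLEST calls above might look like the following. The data set name MYDATA and the variables Y, X, INDUSTRY, YEAR and FIRM are placeholders chosen to match the setup described in the question; they do not come from the note.

```sas
/* Sketch only: MYDATA, Y (0/1 response), X (predictor of interest),
   INDUSTRY, YEAR and FIRM are hypothetical names. */
proc surveylogistic data=mydata;
   class industry year;                  /* industry and year fixed effects */
   cluster firm;                         /* firm-level clustering */
   model y(event='1') = x industry year; /* model the probability that Y=1 */
   store log;                            /* item store read by INSTORE=log in NLEST */
run;
```

The f= expression in NLEST would then need to be written in terms of this model's parameters rather than the three parameters of the example in the note. The CLASS variables add parameters, so check the parameter estimates table for their order before adapting the expressions above.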
Where does the binary variable in your subject line come in?
Your description sure sounds like a regression. The slope of the regression line is that increase per unit.
But SURVEYREG is meant for a continuous response, such as the sale price of a house modeled on its number of square feet.
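For a continuous response, the setup described in the question (industry and year fixed effects, firm clustering) could be sketched roughly as below. MYDATA, Y, X, INDUSTRY, YEAR and FIRM are placeholder names, not names from the original post.

```sas
/* Sketch only: all data set and variable names are hypothetical. */
proc surveyreg data=mydata;
   class industry year;                   /* industry and year fixed effects */
   cluster firm;                          /* firm-level clustering */
   model y = x industry year / solution;  /* SOLUTION prints the parameter estimates */
run;
```

The estimate reported for X is the slope: the estimated change in Y per one-unit increase in X, holding industry and year fixed. It is in the units of Y, not a percent.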