ccherrub
Obsidian | Level 7

I'm trying to run a regression model on minority-enrolled university students over the years 1994-2022 in different states. My code looks similar to:

 

ods output ParameterEstimates=PEforModel1 DataSummary=ObsModel1 FitStatistics=AdjsqModel1 Effects=OverallSigModel1;
Proc SurveyReg data=work.project3 plots=none;
class state year / Ref=first;
Model1: Model MisP = state*year/Solution Adjrsq;
run;

 

With an output of

sas.png

 

I feel as if I should create a DID variable, but I'm not sure how to do that when my data looks like:

sas2.png

 

Can anyone see where the problem may be? I can change the input data at any time, which I feel may also be the issue. Thanks!

StatDave
SAS Super FREQ

I see that your response values are proportions. If these are proportions responding on some binary variable and you have the numerator and denominator counts making up the proportions, then you should be using a logistic model. If the data are not survey data (you aren't using the CLUSTER or STRATA statement), then you just need PROC LOGISTIC to fit the model. You can add an LSMEANS statement with the ILINK option to get the fitted proportions for the STATE*YEAR combinations. For example

proc logistic data=work.project3;
class state year / param=glm;
model num_events/total = state*year / noint;
lsmeans state*year / ilink;
run;

If the model fit is still a problem, as it appears to be in the results you showed, then you will probably need to merge together some levels of STATE and/or YEAR to remove any sparseness in the data. 
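
One way to merge YEAR levels without editing the input data is to apply a format, since CLASS levels are formed from the formatted values. The following is only a sketch: the PERIODF format and its cut points are hypothetical, YEAR is assumed to be numeric, and NUM_EVENTS and TOTAL stand in for whatever your numerator and denominator count variables are called.

```sas
/* Sketch only: the period cut points are hypothetical, not a recommendation. */
proc format;
   value periodf 1994-2003 = '1994-2003'
                 2004-2013 = '2004-2013'
                 2014-2022 = '2014-2022';
run;

proc logistic data=work.project3;
   format year periodf.;          /* CLASS levels follow the formatted values */
   class state year / param=glm;
   model num_events/total = state*year / noint;
   lsmeans state*year / ilink;    /* fitted proportions per STATE*period cell */
run;
```

Whether three periods is a sensible grouping depends on your question. The point is only that fewer, better-populated cells may remove the sparseness.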

 

But if your goal is to make difference in difference (DID) comparisons, then you should review the "Generalized Linear Models with a Non-Identity Link" section of this note. As shown there, you can use the Margins macro with an appropriate contrasts data set to make the various DID comparisons, since you have many levels of both STATE and YEAR.
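
For orientation only, here is a minimal sketch of the simplest two-group, two-period case rather than the many-level comparisons described in the note. Everything specific in it is an assumption: the TREATED and POST indicators, the state codes, and the 2010 cutoff are hypothetical, STATE is assumed to be character, YEAR numeric, and NUM_EVENTS and TOTAL are placeholder names for your counts.

```sas
/* Sketch only: the treated states and the 2010 cutoff are hypothetical. */
data work.did;
   set work.project3;
   treated = (state in ('CA', 'TX'));   /* 1 = hypothetical comparison group of interest */
   post    = (year >= 2010);            /* 1 = hypothetical later period                  */
run;

proc logistic data=work.did;
   /* The treated*post coefficient is the difference in difference */
   /* on the log-odds scale, not on the proportion scale.          */
   model num_events/total = treated post treated*post;
run;
```

Because the logit link is not the identity, that interaction estimate is not the DID in the proportions themselves. Getting the DID on the proportion scale is what the approach in the note is for.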

ccherrub
Obsidian | Level 7
I'm getting a full table or results I want. It may be how I'm interpreting your directions. However, I can't seem to find a way to regress my variable above on three different conditions fused together. I would like state and year to be regressed on. I did make a difference column by using a lag function and subtracting the lagged value from each row; however, I haven't found a way to make it useful.
ccherrub
Obsidian | Level 7
Not getting the table* I meant
StatDave
SAS Super FREQ

It would help if you confirmed whether your response is a proportion of a binary variable and whether you have the numerator and denominator counts, and then stated exactly what it is that you are trying to compute or estimate. It is not at all clear what "all 3 conditions" are and how that figures into the model.
