Hello folks, I am new to SAS modeling and don't have a strong statistics background. I am setting up a model to predict a continuous variable (money spent) from a few independent variables: ethnicity, age group, residence type, services being used, and some interactions. I tried both PROC GLIMMIX and PROC GLM, and they give me very different results; GLM looks more suitable for my situation. Can PROC GLIMMIX be used when the outcome is continuous?
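proc glm data=POP1 plots(only)=(meanplot(cl));
    /* drop records with a missing age group */
    where agegrp1 ne '';
    class eth2(ref='White') res(ref='in home') highfreq agegrp1(ref='56+');
    /* main effects plus ethnicity-by-residence and ethnicity-by-age interactions */
    model POS = eth2 agegrp1 res eth2*res eth2*agegrp1 highfreq / solution;
    /* save predicted values with confidence limits for the mean */
    output out=GLMOut predicted=Pred lclm=lclm uclm=uclm;
    /* least-squares means for each ethnicity-by-age cell, with confidence limits */
    lsmeans eth2*agegrp1 / cl;
run;
quit;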
Below is the plot I obtained comparing the LS-means of my outcome variable:
My question is: does the chart above show predicted values (or marginal effects) of the outcome, and how should I interpret the results I am seeing? I also want to compare the observed difference in the outcome by ethnicity and age group before running the model with the predicted difference after running the model. Should I expect the difference to become smaller, assuming the model is set up properly? The main research question is which factors contribute to the difference in the outcome between the two ethnicity groups, and how to adjust for them so that, hopefully, we no longer see much difference between the two. I would appreciate any feedback or input.
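For the observed (unadjusted) side of that comparison, this is roughly what I have in mind. It is only a sketch: it assumes the same POP1 data set and variable names as in the PROC GLM step above, and that plain cell means are the right "before" quantity to set against the LS-means. The choice of statistics is just my guess at what is useful.

```sas
/* Observed (unadjusted) mean of POS for each ethnicity-by-age cell. */
/* Assumption: same data set and the same missing-age filter as the  */
/* PROC GLM step, so both summaries use the same records.            */
proc means data=POP1 n mean stderr clm maxdec=2;
    where agegrp1 ne '';
    class eth2 agegrp1;
    var POS;
run;
```

My plan would be to put these cell means next to the values from the LSMEANS statement and see how much of the gap between the two ethnicity groups remains after adjustment.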