I'm brand new to SAS and finding it hard to get the results I want.
Using multivariate regression, I have decided to model total volume by subgroups of brand.
I'm using SAS Studio to do this, but I have never done this analysis by subgroups before, only by category as a whole. Is there any difference?
Looking at my results, the parameter estimates table has 6 rows: intercept, units, price, and brand 1, 2 and 3. However, brand 3 has 0 degrees of freedom, which is making me think my method isn't correct. Any advice would be great, thank you!
My code:
proc glmselect data=MS3S30.MARKET1 outdesign(addinputvars)=Work.reg_design;
    class brand_id / param=glm;
    model volume=units price brand_id / showpvalues selection=none;
run;
Using multivariate regression, I have decided to model total volume by subgroups of brand.
I'm using SAS Studio to do this, but I have never done this analysis by subgroups before, only by category as a whole.
It would help if you showed your code. Do you mean you added a BY BRAND; statement into the PROC?
Is there any difference?
Yes, if I am understanding you properly. When you use BY BRAND; you get individual regressions with slopes and intercept changing for each value of BRAND. When you put BRAND into the MODEL statement, you get one overall slope, one overall intercept, and an effect for each level of BRAND.
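The difference between the two approaches can be sketched roughly as follows, using the data set and variable names from the posted code. The sort step, the output data set name Work.market_sorted, and the choice of PROC REG for the BY-group version are assumptions made for illustration only.

```sas
/* Approach 1: BY-group processing.                                   */
/* Fits a separate regression for each brand, so the intercept and    */
/* both slopes are estimated independently within each brand.         */
/* Work.market_sorted is a hypothetical name for the sorted copy.     */
proc sort data=MS3S30.MARKET1 out=Work.market_sorted;
    by brand_id;
run;

proc reg data=Work.market_sorted;
    by brand_id;                      /* requires data sorted by brand_id */
    model volume = units price;
run;
quit;

/* Approach 2: brand as a CLASS effect in the MODEL statement.        */
/* Fits one model: a single slope for units, a single slope for       */
/* price, one intercept, and an effect for each level of brand_id.    */
proc glmselect data=MS3S30.MARKET1;
    class brand_id / param=glm;
    model volume = units price brand_id / showpvalues selection=none;
run;
```

Approach 1 produces one parameter estimates table per brand, while approach 2 produces a single table with a row for each brand level.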
Looking at my results, the parameter estimates table has 6 rows: intercept, units, price, and brand 1, 2 and 3. However, brand 3 has 0 degrees of freedom, which is making me think my method isn't correct.
Yes, that's the correct output. I wrote an explanation here.
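In short, with param=glm SAS builds one indicator column per brand level. Those columns add up to the intercept column, so the last level (brand 3) cannot be estimated separately; its estimate is set to 0 with 0 degrees of freedom and it serves as the baseline that the other brand estimates are compared against. One way to see that this is only a coding choice is to refit the same model with reference coding and a different baseline. The sketch below assumes the levels of brand_id display as '1', '2' and '3'; adjust the ref= value if yours differ.

```sas
/* Same model, but with reference-cell coding and brand 1 as baseline. */
/* ref='1' assumes a brand_id level that displays as 1 (assumption).   */
proc glmselect data=MS3S30.MARKET1;
    class brand_id(ref='1') / param=ref;
    model volume = units price brand_id / showpvalues selection=none;
run;
```

With this coding the baseline level gets no row at all, and the rows for brands 2 and 3 are differences from brand 1. The fitted values should be the same as with param=glm; only the way the brand effect is expressed changes.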
@laurenhosking wrote:
I added my code into the question but thank you so much that does make sense and the explanation really helped!
In previous questions, if Pr > |t| was less than 0.05 I removed the variable and reran the model. In my case brand 1 and 2 are, so am I correct in assuming this is still acceptable to do? And as brand 3 has no value, I'd leave this in?
But Brand 3 does have a value. As I showed in my example, there is a mean for each group.
Also, the usual criterion for removing a variable is that Pr > |t| is greater than 0.05, not less than 0.05.
Also, normally you don't use PROC GLMSELECT if all you have are three variables in the model.
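For a model this small, something like the following PROC GLM sketch may be closer to the usual approach. It is an illustration under the assumption that the same data set and variables are used, not the only way to fit the model.

```sas
/* Fit the same model directly, with no selection step.               */
proc glm data=MS3S30.MARKET1;
    class brand_id;
    /* solution prints the parameter estimates;                       */
    /* ss3 requests Type III tests, including one test for brand_id   */
    /* as a whole effect.                                              */
    model volume = units price brand_id / solution ss3;
    /* Adjusted mean volume for every brand, including brand 3,       */
    /* with standard errors and p-values for pairwise differences.    */
    lsmeans brand_id / stderr pdiff;
run;
quit;
```

The LSMEANS output lists a mean for each brand, brand 3 included, and the Type III test evaluates brand_id as a single effect rather than level by level, which is generally the more useful thing to look at when deciding whether to keep a CLASS variable.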