07-30-2017 09:44 AM
I want to run a regression with dummy variables. My dummies represent half-hour periods from 9:00 to 12:00, so I have six dummies in total. However, as you know, SAS estimates only five dummy coefficients and one intercept in a regression model, and the remaining dummy (in my research, the one for the half-hour period from 11:30 to 12:00) becomes the reference level that is absorbed into the intercept. SAS chooses which dummy becomes the reference, and I want to choose it myself.
I use the code shown below:
proc glm data=Sampledata_02_mer;
   class TRD_EVENT_ROUFOR;
   model adjusted_volume = TRD_EVENT_ROUFOR / solution;
   ods select ParameterEstimates;
quit;
How can I do that?
Thanks in advance.
07-30-2017 09:51 AM - edited 07-30-2017 09:53 AM
In the CLASS statement, use the REF= option.
Although, for most purposes, there is no reason to do this at all. This doesn't change the model fit, it doesn't change the predicted values, it doesn't change the interpretation of anything. The only reason I can think of to actually use the REF= option is if you are running a scientific experiment where one level of the CLASS variable is a control group, a situation you don't have.
07-30-2017 12:20 PM
07-30-2017 05:17 PM - edited 07-30-2017 05:27 PM
It seems the REF= option can only be FIRST or LAST. However, I have six dummies and want to choose among them, and the results are completely different when I change the reference category.
One of the reasons I provide the link to the proper documentation is so that you (and everyone) can see there is a third option, which is neither FIRST nor LAST. This third option is the one you want, as shown by @Reeza.
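For illustration, here is a minimal sketch of that third form, where REF= is given a quoted level instead of FIRST or LAST. It assumes a SAS/STAT release whose PROC GLM CLASS statement accepts REF=, and it uses sashelp.iris as a stand-in because the actual levels of TRD_EVENT_ROUFOR are not shown in this thread. The quoted string should match the formatted value of the level.

```sas
/* Sketch only: sashelp.iris stands in for the real data.               */
/* 'Versicolor' is neither the first nor the last level of Species in   */
/* the default sort order, so FIRST/LAST could not select it.           */
proc glm data=sashelp.iris;
   class species(ref='Versicolor');   /* quoted formatted value of the level */
   model SepalLength = species / solution;
   ods select ParameterEstimates;
run;
quit;
```

The same pattern should carry over to the original model by writing class TRD_EVENT_ROUFOR(ref='...') with the formatted value of the desired half-hour level inside the quotes.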
As I tried to point out above, the use of these coefficients is problematic, as you have already shown that depending on what you change in the REF option, the coefficients change. THEY ARE NOT UNIQUE. THERE IS NO CORRECT UNIQUE ANSWER. THERE IS NO CORRECT UNIQUE MODEL parameterization. And yet they are all the same model.
You have to be careful in your interpretation of these coefficients. To insist that you want a specific REF= option is fine if you know how to interpret them, so let me explain. The delta between any two coefficients is constant, regardless of the REF= option, but the individual coefficients are not to be interpreted in any other way. You can interpret the delta between two coefficients, but not the coefficients themselves.
Better yet, you should compute LSMEANS for each level of your category variable. THESE ARE UNIQUE. These tell you what you really want to know. Make your life simple, interpret the LSMEANS and not the model coefficients.
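A minimal sketch of the LSMEANS approach, again using sashelp.iris as a stand-in for the real data (an assumption, since the original data set is not available here). The REF= choice in this sketch is arbitrary; changing it should alter the parameter estimates but leave the least-squares means and their pairwise differences unchanged.

```sas
/* Sketch only: sashelp.iris stands in for the real data. */
proc glm data=sashelp.iris;
   class species(ref='Setosa');        /* arbitrary reference level */
   model SepalLength = species / solution;
   /* One least-squares mean per level, with standard errors (STDERR)   */
   /* and p-values for all pairwise differences (PDIFF).                */
   lsmeans species / pdiff stderr;
run;
quit;
```

Note that an ods select ParameterEstimates; statement is left out on purpose, since it would suppress the LSMEANS tables.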
07-30-2017 12:50 PM
Here is my first result:
When I use "REF=first", the result is:
Question 1: Why did the result change when I used REF=, contrary to what you mentioned?
Question 2: The REF= option can be FIRST or LAST, but I want to choose a reference category from the levels in between. How can I do that?
07-30-2017 10:05 AM
Use the ORDER= option in the PROC GLM statement to control the order in which the procedure sorts the levels of the class variables, as in this small example:
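/* Sort so that the desired reference level comes last in data order. */
proc sort data=sashelp.iris out=iris;
   by descending species;
run;

/* ORDER=DATA makes PROC GLM take the class levels in the order they appear; */
/* the last level becomes the reference.                                      */
proc glm data=iris order=data;
   class species;
   model SepalLength = species / solution;
   ods select ParameterEstimates;
run;
quit;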
Other ways of controlling it are described under the ORDER= option in the PROC GLM documentation.