Hello, everybody.
I want to regress volume on time-based dummy variables, using PROC GENMOD and PROC GLM to create the dummies automatically.
In addition, I use a DATA step to create the dummies manually. I have seven dummies, defined as follows:
Dummy_1: 9:00 <= Time < 9:30;
Dummy_2: 9:30 <= Time < 10:00;
Dummy_3: 10:00 <= Time < 10:30;
Dummy_4: 10:30 <= Time < 11:00;
Dummy_5: 11:00 <= Time < 11:30;
Dummy_6: 11:30 <= Time < 12:00;
Dummy_7: 12:00 <= Time < 12:30;
Here is some of my code:
* Regressing normalized volume on the time dummies (GENMOD builds them from the CLASS variable);
proc genmod data=Sampledata_adjvol;
   class TRD_EVENT_ROUFOR / param=effect;
   model adjusted_volume = TRD_EVENT_ROUFOR / noscale;
   ods select ParameterEstimates;
run;
* Same analysis with PROC GLM and the CLASS statement;
proc glm data=Sampledata_adjvol;
   class TRD_EVENT_ROUFOR; /* Generates dummy variables internally */
   model adjusted_volume = TRD_EVENT_ROUFOR / solution;
   ods select ParameterEstimates;
quit;
* Creating dummy variables manually;
data Sampledata_adjvol_DumVar;
   set Sampledata_adjvol;
   if TRD_EVENT_ROUNDED = 34200 then TRD_EVENT_ROUNDED_1 = 1;
   else TRD_EVENT_ROUNDED_1 = 0;
   if TRD_EVENT_ROUNDED = 36000 then TRD_EVENT_ROUNDED_2 = 1;
   else TRD_EVENT_ROUNDED_2 = 0;
   if TRD_EVENT_ROUNDED = 37800 then TRD_EVENT_ROUNDED_3 = 1;
   else TRD_EVENT_ROUNDED_3 = 0;
   if TRD_EVENT_ROUNDED = 39600 then TRD_EVENT_ROUNDED_4 = 1;
   else TRD_EVENT_ROUNDED_4 = 0;
   if TRD_EVENT_ROUNDED = 41400 then TRD_EVENT_ROUNDED_5 = 1;
   else TRD_EVENT_ROUNDED_5 = 0;
   if TRD_EVENT_ROUNDED = 43200 then TRD_EVENT_ROUNDED_6 = 1;
   else TRD_EVENT_ROUNDED_6 = 0;
   if TRD_EVENT_ROUNDED = 45000 then TRD_EVENT_ROUNDED_7 = 1;
   else TRD_EVENT_ROUNDED_7 = 0;
run;
* Checking that each time value maps to exactly one dummy;
proc freq data=Sampledata_adjvol_DumVar;
   tables TRD_EVENT_ROUNDED*TRD_EVENT_ROUNDED_1*TRD_EVENT_ROUNDED_2*TRD_EVENT_ROUNDED_3*TRD_EVENT_ROUNDED_4*TRD_EVENT_ROUNDED_5*TRD_EVENT_ROUNDED_6*TRD_EVENT_ROUNDED_7 / list;
run;
* Regressing normalized volume on the manually created dummies;
ods graphics on;
proc reg data=Sampledata_adjvol_DumVar plots(maxpoints=none);
   model adjusted_volume = TRD_EVENT_ROUNDED_1 TRD_EVENT_ROUNDED_2 TRD_EVENT_ROUNDED_3 TRD_EVENT_ROUNDED_4 TRD_EVENT_ROUNDED_5 TRD_EVENT_ROUNDED_6 TRD_EVENT_ROUNDED_7;
run;
quit;
ods graphics off;
The results are attached to this post.
Why is the final dummy not estimated?
What is the problem?
How can I fix that?
Thanks in advance.
Look up the "dummy variable trap." With dummy variables, you need (number of levels - 1) dummies in the model. The one left out is the base level against which the other parameters are measured. In the simplest case, with two levels, you need one dummy, and its coefficient is the effect of being in the level coded 1 compared to the level coded 0.
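To make this concrete for the PROC REG call in the question, here is a minimal sketch that leaves out one dummy. Which dummy to drop is an arbitrary choice for illustration (the seventh here); it assumes every observation falls in exactly one of the seven intervals.

```sas
/* Sketch only: the 7th dummy is omitted and becomes the base level. */
/* Each remaining coefficient is the difference from the 7th interval; */
/* the intercept is the mean of adjusted_volume in the 7th interval.   */
proc reg data=Sampledata_adjvol_DumVar plots=none;
   model adjusted_volume = TRD_EVENT_ROUNDED_1 TRD_EVENT_ROUNDED_2
                           TRD_EVENT_ROUNDED_3 TRD_EVENT_ROUNDED_4
                           TRD_EVENT_ROUNDED_5 TRD_EVENT_ROUNDED_6;
run;
quit;
```

Dropping a different dummy changes only the base level and therefore how the coefficients are read, not the fitted values.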
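As explained above, if you have N levels, you can only estimate N-1 coefficients plus the intercept, because the N dummies of a categorical variable sum to 1 for every observation, which is exactly the intercept column. If you leave the intercept out of the model, then you can estimate all N levels. This is basic math.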
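If you do want all seven estimates, a sketch of the no-intercept version would look like this, using the NOINT option on the MODEL statement. It assumes the seven dummies cover every observation.

```sas
/* Sketch only: no intercept, so all seven dummies can be estimated.  */
/* Each coefficient is then the mean of adjusted_volume within that   */
/* time interval rather than a difference from a base level.          */
proc reg data=Sampledata_adjvol_DumVar plots=none;
   model adjusted_volume = TRD_EVENT_ROUNDED_1 TRD_EVENT_ROUNDED_2
                           TRD_EVENT_ROUNDED_3 TRD_EVENT_ROUNDED_4
                           TRD_EVENT_ROUNDED_5 TRD_EVENT_ROUNDED_6
                           TRD_EVENT_ROUNDED_7 / noint;
run;
quit;
```

Keep in mind that PROC REG redefines R-square when NOINT is used, so it is not comparable to the R-square from the model with an intercept.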
Also, you keep writing something like this, in this and other threads:
First half an hour: 9:00 << Dummy_1 < 9:30;
which makes absolutely no sense at all: dummy_1 is either 0 or 1 (otherwise it's not a dummy variable), and a variable that has values of 0 or 1 cannot be between 9:00 and 9:30. You most likely mean
dummy1 = 9:00 <= time_1 < 9:30;
(which might not be correct syntax, but you get the idea)
so I would hope that you will write more meaningful and understandable math and SAS code in the future.
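For what it's worth, the definition above can be written as valid SAS along these lines. This is only a sketch: time_1 is the hypothetical variable name used above and is assumed to hold a numeric SAS time value (seconds since midnight); the output data set name is made up.

```sas
/* Sketch only: time_1 and dummy_sketch are assumed names. */
data dummy_sketch;
   set Sampledata_adjvol;
   /* A comparison evaluates to 1 when true and 0 when false, */
   /* so each assignment yields a 0/1 dummy directly.         */
   dummy1 = ('09:00't <= time_1 < '09:30't);
   dummy2 = ('09:30't <= time_1 < '10:00't);
   /* ...and so on for the remaining half-hour intervals */
run;
```

A missing time_1 makes each comparison false, so such rows get 0 for every dummy.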