How SAS calculates regression with dummy variables?

Posted 06-16-2017 12:04 PM
(4540 views)

Hello, everybody.

I want to regress volume on a set of time-based dummy variables. I use PROC GENMOD and PROC GLM, whose CLASS statements create the dummies automatically.

In addition, I use a DATA step to create the dummies manually. I have seven dummies, defined as follows:

Dummy_1: 9:00 <= Time < 9:30;

Dummy_2: 9:30 <= Time < 10:00;

Dummy_3: 10:00 <= Time < 10:30;

Dummy_4: 10:30 <= Time < 11:00;

Dummy_5: 11:00 <= Time < 11:30;

Dummy_6: 11:30 <= Time < 12:00;

Dummy_7: 12:00 <= Time < 12:30;

Here is my code:

```
* Regressing dummy variables on normalized volume variable using calculated volume;
proc genmod data=Sampledata_adjvol;
  class TRD_EVENT_ROUFOR / param=effect;
  model adjusted_volume = TRD_EVENT_ROUFOR / noscale;
  ods select ParameterEstimates;
run;

* Same analysis by using the CLASS statement;
proc glm data=Sampledata_adjvol;
  class TRD_EVENT_ROUFOR; /* Generates dummy variables internally */
  model adjusted_volume = TRD_EVENT_ROUFOR / solution;
  ods select ParameterEstimates;
quit;
```

```
* Creating dummy variables manually;
data Sampledata_adjvol_DumVar;
  set Sampledata_adjvol;
  if TRD_EVENT_ROUNDED = 34200 then TRD_EVENT_ROUNDED_1 = 1;
  else TRD_EVENT_ROUNDED_1 = 0;
  if TRD_EVENT_ROUNDED = 36000 then TRD_EVENT_ROUNDED_2 = 1;
  else TRD_EVENT_ROUNDED_2 = 0;
  if TRD_EVENT_ROUNDED = 37800 then TRD_EVENT_ROUNDED_3 = 1;
  else TRD_EVENT_ROUNDED_3 = 0;
  if TRD_EVENT_ROUNDED = 39600 then TRD_EVENT_ROUNDED_4 = 1;
  else TRD_EVENT_ROUNDED_4 = 0;
  if TRD_EVENT_ROUNDED = 41400 then TRD_EVENT_ROUNDED_5 = 1;
  else TRD_EVENT_ROUNDED_5 = 0;
  if TRD_EVENT_ROUNDED = 43200 then TRD_EVENT_ROUNDED_6 = 1;
  else TRD_EVENT_ROUNDED_6 = 0;
  if TRD_EVENT_ROUNDED = 45000 then TRD_EVENT_ROUNDED_7 = 1;
  else TRD_EVENT_ROUNDED_7 = 0;
run;

* Checking that each rounded time maps to exactly one dummy;
proc freq data=Sampledata_adjvol_DumVar;
  tables TRD_EVENT_ROUNDED*TRD_EVENT_ROUNDED_1*TRD_EVENT_ROUNDED_2*TRD_EVENT_ROUNDED_3*TRD_EVENT_ROUNDED_4*TRD_EVENT_ROUNDED_5*TRD_EVENT_ROUNDED_6*TRD_EVENT_ROUNDED_7 / list;
run;

* Regressing dummy variables on normalized volume variable using calculated volume;
ods graphics on;
proc reg data=Sampledata_adjvol_DumVar plots(maxpoints=none);
  model adjusted_volume = TRD_EVENT_ROUNDED_1 TRD_EVENT_ROUNDED_2 TRD_EVENT_ROUNDED_3 TRD_EVENT_ROUNDED_4 TRD_EVENT_ROUNDED_5 TRD_EVENT_ROUNDED_6 TRD_EVENT_ROUNDED_7;
run;
ods graphics off;
```

The results are attached to this post.

Why is the final dummy not estimated?

What is the problem?

How can I fix that?

Thanks in advance.


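As explained above, if you have N levels, you can estimate only N-1 coefficients plus the intercept, because the N dummy variables sum to 1 in every observation and are therefore perfectly collinear with the intercept. If you leave the intercept out of the model, then you can estimate all N levels. This is basic math.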
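A minimal sketch of the two ways out, reusing the data set and dummy names from the question and assuming every observation falls in exactly one of the seven intervals. Either keep the intercept and leave one dummy out (the seventh is an arbitrary choice of reference level here), or keep all seven dummies and add the NOINT option:

```
/* Option 1: keep the intercept and omit one dummy (the 7th) as the reference level */
proc reg data=Sampledata_adjvol_DumVar;
  model adjusted_volume = TRD_EVENT_ROUNDED_1 TRD_EVENT_ROUNDED_2 TRD_EVENT_ROUNDED_3
                          TRD_EVENT_ROUNDED_4 TRD_EVENT_ROUNDED_5 TRD_EVENT_ROUNDED_6;
run;
quit;

/* Option 2: drop the intercept so that all seven dummies can be estimated */
proc reg data=Sampledata_adjvol_DumVar;
  model adjusted_volume = TRD_EVENT_ROUNDED_1 TRD_EVENT_ROUNDED_2 TRD_EVENT_ROUNDED_3
                          TRD_EVENT_ROUNDED_4 TRD_EVENT_ROUNDED_5 TRD_EVENT_ROUNDED_6
                          TRD_EVENT_ROUNDED_7 / noint;
run;
quit;
```

In the first model the intercept is the mean of adjusted_volume in the omitted interval, and each coefficient is the difference from that mean. In the second model each coefficient is the interval mean itself. PROC REG redefines R-square when NOINT is used, so the R-square values of the two models are not comparable.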

Also, you keep writing something like this, in this and other threads:

First half an hour: 9:00 << Dummy_1 < 9:30;

which makes no sense at all: dummy_1 is either 0 or 1 (otherwise it's not a dummy variable), and a variable whose values are 0 or 1 cannot be between 9:00 and 9:30. You most likely mean:

dummy1 = 9:00 <= time_1 < 9:30;

(which might not be correct syntax, but you get the idea)

so I would hope that you will write more meaningful and understandable math and SAS code in the future.
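On the syntax point, one form that SAS does accept is a Boolean expression with time literals, which evaluates to 1 when the condition holds and 0 otherwise. A sketch only: the data set names have and want and the variable trade_time (assumed to hold a SAS time value, in seconds since midnight) are hypothetical placeholders, not names from the original post.

```
data want;                /* hypothetical output data set */
  set have;               /* hypothetical input data set */
  /* Each comparison returns 1 if trade_time is in the interval, else 0 */
  dummy_1 = ('9:00't  <= trade_time < '9:30't);
  dummy_2 = ('9:30't  <= trade_time < '10:00't);
  dummy_3 = ('10:00't <= trade_time < '10:30't);
run;
```

The remaining intervals follow the same pattern. Because the lower bound is inclusive and the upper bound is exclusive, each time value lands in exactly one interval.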

--

Paige Miller

