PROC REG with categorical variables and all possible subsets of models

Posted 02-23-2014 03:51 PM

Hi there--

I've been using proc reg to generate all possible models sorted by AIC, but I've run into a problem. I have three categorical variables, and proc reg does not accept them as-is. I changed their values from text to numbers (e.g. "urban" became 1 and "suburban" became 0 for my "level of urbanization" category). I added these to the model statement, but instead of increasing, the number of possible models decreased from over 2000 to around 300. Does this make sense? The number of possible models should increase with added variables, right?

Here is my code--the categorical variables are X4, X5, X6:

proc reg data=chill outest=est;

model y1=x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 x12 x13/ selection=adjrsq sse aic ;

output out=out p=p r=r; run; quit;

proc reg data=chill outest=est0;

model y1=x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 x12 x13 / noint selection=adjrsq sse aic ;

output out=out0 p=p r=r; run; quit;

data estout;

set est est0; run;

proc sort data=estout; by _aic_; run;

proc print data=estout(obs=8); run;

Did I do something wrong? Or does it make sense for the number of models to decrease?
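For reference, the arithmetic behind the expectation in the question can be written out. With $p$ candidate regressors, an all-subsets search has one model per non-empty subset:

$$
\sum_{k=1}^{p} \binom{p}{k} = 2^{p} - 1
$$

So $p = 10$ gives $2^{10} - 1 = 1023$ models and $p = 13$ gives $2^{13} - 1 = 8191$. The count can only grow as regressors are added, so a drop to around 300 suggests that the procedure did not list every subset, not that fewer models exist. As a guess only (the earlier variable list is not shown in the post): two runs over ten regressors, one with and one without an intercept, would produce $2 \times 1023 = 2046$ rows, which may be where "over 2000" came from.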

4 REPLIES


There is something I am not getting. Both model inputs are the same. The only difference is the absence of an intercept term in the second procedure call. That doesn't correspond to your description. Besides, how could you get a list of (2000?) models from proc reg with character regressors?

Note that when a categorical regressor has N categories, you need N-1 dummy variables to replace it in a regression setting.

PG
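As a minimal sketch of the N-1 dummy coding described above: the DATA step below assumes a hypothetical character variable `urbanization` in `chill` with three levels ('urban', 'suburban', 'rural'). The variable name, the 'rural' level, and the output names `chill_dummies`, `d_urban`, and `d_suburban` are illustrative assumptions and do not come from the original post.

```sas
/* Sketch only: URBANIZATION and its three levels are assumed for illustration. */
data chill_dummies;
   set chill;
   if missing(urbanization) then do;
      /* keep missing categories missing instead of coding them as 0 */
      d_urban    = .;
      d_suburban = .;
   end;
   else do;
      /* 3 levels -> 2 dummies; 'rural' is the reference level (both dummies = 0) */
      d_urban    = (lowcase(urbanization) = 'urban');
      d_suburban = (lowcase(urbanization) = 'suburban');
   end;
run;
```

A comparison in SAS evaluates to 1 or 0, so each dummy is a numeric indicator that proc reg will accept. A two-level variable needs only one such indicator, which is what the 1/0 recoding in the question already provides.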



Following up on PG's response--How do you plan to deal with all possible models when some of the independent variables are exclusive categories? Does it make any sense at all to include (for instance) 'urban', and exclude 'suburban', especially when this will exclude a large part of your database?

I have some major doubts about any analysis produced in this manner.

Steve Denham
