How to create dummy variables for an interval categorical variable to ...

Posted 12-03-2018 06:11 PM
(1235 views)

Dear SAS Community,

I am trying to structure my data to run a regression that includes a categorical variable with multiple interval levels. The variable is viral load control (VLCONTROL), which was created from lab results across a longitudinal period and takes values from 1 to 4.

Each response category corresponds to one of a mutually exclusive set of ranges, as defined below.

If all were <50 then vlcontrol=1

If any >=50 and all <400 then vlcontrol=2

If any >=400 and <1000 then vlcontrol=3

If any >=1000 then vlcontrol=4.

I attempted to create a series of dummy variables to run a linear regression in proc reg, but I got the following error.

SAS Output

Note: Model is not full rank. Least-squares solutions for the parameters are not unique. Some statistics will be misleading. A reported DF of 0 or B means that the estimate is biased.

Note: The following parameters have been set to 0, since the variables are a linear combination of other variables as shown.

Below is how I coded the dummy variables and how I ran the model:

```
data regression;
    keep ctvalue vlcontrol vlcont_50 vlcont_400 vlcont_1000 vlcont_ge1000;
    set have;
    vlcont_50=0;
    if vlcontrol=1 then vlcont_50=1;
    vlcont_400=0;
    if vlcontrol=2 then vlcont_400=1;
    vlcont_1000=0;
    if vlcontrol=3 then vl_1000=1;
    vlcont_ge1000=0;
    if vlcontrol=4 then vlcont_ge1000=1;
run;

proc reg data=regression outest=est;
    model ctvalue=vlcont_50 vlcont_400 vlcont_1000 vlcont_ge1000 / clb rsquare tol vif aic bic;
    id pid;
    ods output ParameterEstimates=PE;
run;
quit;
```

Not sure how to do this better. I know I can do this in PROC GLM etc., but I want the diagnostic features for building a model in PROC REG. I've tried leaving the last dummy variable, vlcont_ge1000, out of the model, but I still get the same note in the output. Is there a better way I should be coding my dummy variables?

Thanks,

Cara

Accepted Solutions

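Generally, when you have 4 levels, you only need 3 dummy variables to represent them, not 4. With an intercept in the model, the four dummies always sum to 1, so the fourth is a linear combination of the intercept and the other three. Remove one of your dummy variables from the model; that level becomes your reference level, and the remaining three will be fine.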
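
As a rough sketch of that advice, the step below builds three dummies and treats level 1 (all <50) as the reference level; which level to use as the reference is an assumption here, not something stated in the thread. Two details of the posted code are also worth checking. The data step assigns `vl_1000` rather than `vlcont_1000`, which leaves `vlcont_1000` equal to 0 on every row and would by itself produce the "not full rank" note. The `keep` statement also omits `pid`, which the `id` statement needs. The sketch assumes `have` contains `pid`, `ctvalue`, and a nonmissing `vlcontrol` coded 1 to 4.

```sas
/* Sketch only: assumes HAVE contains pid, ctvalue, and vlcontrol (1-4).   */
/* Level 1 (all <50) is used as the reference level, so it gets no dummy.  */
data regression;
    set have;
    keep pid ctvalue vlcontrol vlcont_400 vlcont_1000 vlcont_ge1000;
    /* A comparison in SAS evaluates to 1 when true and 0 when false.      */
    /* A missing vlcontrol would yield 0 for all three dummies, so rows    */
    /* with missing vlcontrol should be excluded beforehand.               */
    vlcont_400    = (vlcontrol = 2);
    vlcont_1000   = (vlcontrol = 3);
    vlcont_ge1000 = (vlcontrol = 4);
run;

proc reg data=regression outest=est;
    /* Three dummies plus the intercept; the intercept estimates the mean  */
    /* of ctvalue at the reference level (vlcontrol = 1).                  */
    model ctvalue = vlcont_400 vlcont_1000 vlcont_ge1000 / clb rsquare tol vif aic bic;
    id pid;
    ods output ParameterEstimates=PE;
run;
quit;
```

With this coding, each dummy's coefficient is the estimated difference in mean `ctvalue` between that level and level 1.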

2 REPLIES 2

Thanks, this worked! I had to create the dummy variable for the last level and just leave it out of the model for it to work.
