Solved
Contributor
Posts: 20

# Re-categorize data into new variables

I'm trying to organize a continuous variable into smaller blocks. For example, I'm using CD4 counts from 1 to 1500, and I want to break the CD4 count variable into smaller blocks for analysis: 1-200 would be low CD4, 201-500 would be medium CD4, and 501 and above would be high CD4.

What code would I use to do this, if it can be done?

Accepted Solutions
Solution
‎01-12-2018 04:37 PM
Super User
Posts: 13,507

## Re: Re-categorize data into new variables

[ Edited ]
Posted in reply to UGAstudent

An example:

proc format library=work;
    /* Format names cannot end in a digit, to avoid
       confusion with display length options. */
    value cd4_
        0   -  200  = 'Low'
        200 <- 500  = 'Medium'
        500 <- high = 'High'
    ;
run;

proc freq data=yourdata;
    tables cd4;
    format cd4 cd4_.;
run;

Formats are powerful tools because you can create several formats for the same variable and apply whichever one you need, without changing the underlying data. Almost all of the analysis procedures will honor the groupings assigned by custom formats.
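If a stored grouping variable is needed rather than a display-time grouping, one possible approach is to apply the format with the PUT function in a DATA step. This is only a sketch: it assumes the cd4_ format above has already been defined, and the output data set name work.cd4_grouped and the variable name cd4_group are made up for illustration.

```sas
/* Sketch: store the formatted CD4 group as a real character variable.  */
/* Assumes the cd4_ format exists; work.cd4_grouped and cd4_group are   */
/* hypothetical names chosen for this example.                          */
data work.cd4_grouped;
    set yourdata;
    length cd4_group $6;               /* long enough for 'Medium'      */
    if not missing(cd4) then
        cd4_group = put(cd4, cd4_.);   /* 'Low', 'Medium', or 'High'    */
run;

/* The new variable can be tabulated directly, with no FORMAT statement. */
proc freq data=work.cd4_grouped;
    tables cd4_group;
run;
```

Rows with a missing cd4 are left with a blank cd4_group here, since the format's ranges do not cover missing values.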

All Replies
PROC Star
Posts: 1,770

## Re: Re-categorize data into new variables

[ Edited ]
Posted in reply to UGAstudent

Use PROC FORMAT to create a user-defined format.

This paper may be of help:

http://www2.sas.com/proceedings/sugi27/p056-27.pdf

Contributor
Posts: 20

## Re: Re-categorize data into new variables

This worked well.

However, I'm trying to run a linear regression, and it says the newly created variables are not found:

proc Reg data=Import;
title "Example of linear regression";
model Factor1 = Low;
run;

Posts: 2,985

## Re: Re-categorize data into new variables

[ Edited ]
Posted in reply to UGAstudent

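Low is not a variable. It is a formatted value of cd4, so PROC REG cannot find it in the data set.

If your goal is to do a regression on just the low values, none of the formatting is needed. Subset the data with a WHERE= option and use cd4 itself as the predictor: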

proc Reg data=Import(where=(0<=cd4<=200));
title "Example of linear regression";
model Factor1 = cd4;
run;
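If the goal is instead to compare Factor1 across the CD4 groups, one option is to build 0/1 indicator variables in a DATA step and pass those to PROC REG. The following is a sketch under that assumption, not necessarily the analysis intended; the names work.import_grouped, cd4_low, and cd4_high are hypothetical, and the medium group (200 < cd4 <= 500) serves as the reference level.

```sas
/* Sketch: create numeric indicator variables that PROC REG can use.    */
/* work.import_grouped, cd4_low and cd4_high are hypothetical names.    */
data work.import_grouped;
    set Import;
    /* Guard against missing values: in SAS a missing numeric compares  */
    /* as smaller than any number, so (cd4 <= 200) would be true.       */
    if not missing(cd4) then do;
        cd4_low  = (cd4 <= 200);   /* 1 if low CD4, else 0  */
        cd4_high = (cd4 > 500);    /* 1 if high CD4, else 0 */
    end;
run;

proc reg data=work.import_grouped;
    title "Factor1 by CD4 group (medium as reference)";
    model Factor1 = cd4_low cd4_high;
run;
quit;
```

Each coefficient then estimates the difference in mean Factor1 between that group and the medium group.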

P.S.: It is always helpful to say what analysis you would like to do in your original post.

--
Paige Miller
☑ This topic is solved.
