DATA Step, Macro, Functions and more

Recoding in a new variable

Reply
Regular Contributor
Posts: 199

Recoding in a new variable

I am trying to create one nominal variable w/ 6 categories from 3 existing variables. I want values from 0-5 in the new variable (combo_expo).

 

With the syntax below, the last IF-THEN-ELSE statement overwrites the previous statements. The values are all 0 or 5. This seems pretty straight forward, but I haven't been able to resolve it w/ my own methods. Thanks. 

 

* Never users;
IF ever_used = 2 THEN combo_expo = 1;
ELSE IF ever_used NE 2 THEN combo_expo = 0;

* Past users. Not in the past 30 days;
IF last_used GE 30 THEN combo_expo = 2;
ELSE combo_expo = 0;

* Current, light users;
IF days_use_month LE 4THEN combo_expo = 3;
ELSE combo_expo = 0;

 

* Current, heavy users;
IF days_use_month GE 5 THEN combo_expo = 4;
ELSE combo_expo = 0;

* Chronic users, monthly x 1 year;
IF monthly_yr = 1 THEN combo_expo = 5;
ELSE combo_expo = 0;

Super User
Posts: 17,750

Re: Recoding in a new variable

Are you trying to create 1 new variable or 5 new variables?

 

If you're doing 1, why not make them all ELSE IF after the first condition. 

 

Otherwise, I think you'll need to explain the logic in more detail and with sample data.

Regular Contributor
Posts: 199

Re: Recoding in a new variable

[ Edited ]

I am trying to create 1 new variable. I added ELSE IF statements. My latest syntax is below.

 

I get different results when I change the order of the IF-THE-ELSE statements. This seems like a lack of understanding on my part as to how SAS processes these conditions in the IF-THE-ELSE statements.

 

I have attached the data and a copy of my results I get w/ the syntax below. In this order, the IF-THE-ELSE statements produce the correct counts, except for monthly_yr, which has fewer counts than it should (518 v. 953). 

 

Thanks again for the assistance.

	* Current, light users;
    IF days_use_month LE 4 AND NOT MISSING (days_use_month) THEN combo_expo = 3;

    * Current, heavy users;
    ELSE IF days_use_month GE 5 AND days_use_month LE 30 AND NOT MISSING (days_use_month) THEN combo_expo = 4;

	* Never users;
    ELSE IF ever_used = 2 THEN combo_expo = 1;
    
    * Chronic users, monthly x 1 year;
    ELSE IF monthly_yr = 1 THEN combo_expo = 5;
    
    * Past users. Not in the past 30 days;
    ELSE IF last_used GE 30 THEN combo_expo = 2;

 PROC FREQ.jpg

Attachment
Super User
Posts: 17,750

Re: Recoding in a new variable

I think you need to read through @Astounding post again. It sounds like you haven't quite thought through all your categories that's the issue. 

Super User
Posts: 17,750

Re: Recoding in a new variable

I think you need to read through @Astounding post again. It sounds like you haven't quite thought through all your categories that's the issue. 

Super User
Posts: 10,466

Re: Recoding in a new variable

You may want to run some code like this against your data to see that actual combinations that you have.

proc freq data=have;
  tables ever_used *last_used *  days_use_month * monthly_yr *combo_expo/ list missing;
run;

Which may reveal that some of the values you expect to come from monthly_yr are being appropriated by a different set of variables.

 

Super User
Posts: 5,072

Re: Recoding in a new variable

You probably need to think through the plan a little more.  You actually have 32 categories, so 0-5 won't be enough.  The logic of the variables may rule out some of the combinations, but it's a bad idea to rule them out without having the data tell you whether or not they exist.  For example, the data might indicate that both "never a user" and "current light user" are both true.  Here's one train of thought to get you started:

 

combo_expo='00000';


IF ever_used = 2 THEN substr(combo_expo, 5, 1) = '1';
IF last_used GE 30 THEN substr(combo_expo, 4, 1) = '2';
IF days_use_month LE 4 THEN substr(combo_expo, 3, 1) = '3';
IF days_use_month GE 5 THEN substr(combo_expo, 2, 1) = '4';

IF monthly_yr = 1 THEN substr(combo_expo, 1, 1) = '5';

 

You can always change the order if you would like.  Then run a PROC FREQ on combo_expo to get a picture of what your data contains.

 

 

PROC Star
Posts: 1,558

Re: Recoding in a new variable

[ Edited ]

Like this?

 

  COMBO_EXPO=ifn(MONTHLY_YR     eq  1, 5   %* Chronic users, monthly x 1 year    ;
            ,ifn(DAYS_USE_MONTH ge  5, 4   %* Current, heavy users               ;
            ,ifn(DAYS_USE_MONTH le  4, 3   %* Current, light users               ;
            ,ifn(LAST_USED      ge 30, 2   %* Past users. Not in the past 30 days;
            ,ifn(EVER_USED      eq  2, 1   %* Never users                        ;
            ,                          0)))));

 

 

Super User
Super User
Posts: 7,392

Re: Recoding in a new variable

If you only want the first condition then maybe a select statement:

data want;
  set have;
  select;
    when(ever_used=2) combo_expo=1;
    when(last_used >= 30) combo_expo=2;
    ...
    otherwise combo_expo=0;
  end;
run;
   
Ask a Question
Discussion stats
  • 8 replies
  • 149 views
  • 1 like
  • 6 in conversation