DATA Step, Macro, Functions and more

create two new variables based on values of original variable

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 87
Accepted Solution

create two new variables based on values of original variable

Thanks a lot in advance, friends. Below are samples of both what i have and what i want.

 

data have;
input category $25. ;
cards;
APPLEPLUS
A1
A2
3
4
BANANAPLUS
3
B5
6
PEARPLUS
2
Q15
C17
3
Z02
;
RUN;

 

DATA WANT;
input category $25. category1 $10. ;;
cards;
APPLEPLUS  A1
APPLEPLUS  A2
APPLEPLUS  3
APPLEPLUS  4
BANANAPLUS  3
BANANAPLUS  B5
BANANAPLUS   6
PEARPLUS     2
PEARPLUS    Q15
PEARPLUS     C17
PEARPLUS      3
PEARPLUS Z02
;
RUN;

 

I tried using the following code, but it doesnt get me much. The values such as appleplus, BANANAPLUS etc are typically long all character but the others are much smaller and are either numerical or starting with an alphabet.

 

DATA WANT;
SET HAVE;
LENGTH CATEGORY1 $25.;
IF LENGTH(CATEGORY)> 7 THEN CATEGORY1 = CATEGORY;
RUN;


Accepted Solutions
Solution
‎12-13-2016 05:33 PM
Super User
Posts: 5,083

Re: create two new variables based on values of original variable

[ Edited ]

It's a one-step solution that skips creating HAVE and produces WANT from the original data.

 

I guess if you already have the data set HAVE, you could get WANT with:

 

data want;

set have (rename=(category=text));

if length(text) > 7 then category=text;

else do;

   category1=text;

   output;

end;

retain category;

drop text;

run;

View solution in original post


All Replies
Valued Guide
Posts: 797

Re: create two new variables based on values of original variable

Using these notions, you should be able to construct a data step to do the task:

 

  1. Rename category to category1 when reading HAVE, so now category is a new variable and category1 is old.
  2. Instead of looking for a long word to determine the presence of a new category, why not look for the string 'PLUS', as in
       if find(category1,'PLUS') then category=category1;
  3. output only if #2 is not true (because it will have category, but not category1).
  4. Retain the "new" variable category, to carray foward to the subsequent category1 series.

 

You could also do analogous logic when reading in the raw data, i.e. straight from raw data to WANT, no need for HAVE.

 

Super User
Posts: 5,083

Re: create two new variables based on values of original variable

One possibility:

 

data want;
input text $25. ;

length category $ 25 category1 $ 10;

if length(text) > 7 then category=text;

else do;

   category1=text;

   output;

end;

retain category;
cards;
APPLEPLUS
A1
A2
3
4

BANANAPLUS
3
B5
6
PEARPLUS
2
Q15
C17
3
Z02
;

 

Frequent Contributor
Posts: 87

Re: create two new variables based on values of original variable

Thanks Astounding, but not sure what you got here. I mean my want will give you the final output i want, so not sure what you mean by cards statement in your solution. 

Also, the names such as appleplus and bananaplus are examples only, real names are 2-3 word characters. I have around 30,000 records.

Solution
‎12-13-2016 05:33 PM
Super User
Posts: 5,083

Re: create two new variables based on values of original variable

[ Edited ]

It's a one-step solution that skips creating HAVE and produces WANT from the original data.

 

I guess if you already have the data set HAVE, you could get WANT with:

 

data want;

set have (rename=(category=text));

if length(text) > 7 then category=text;

else do;

   category1=text;

   output;

end;

retain category;

drop text;

run;

Super User
Posts: 9,681

Re: create two new variables based on values of original variable


data have;
input category $25. ;
cards;
APPLEPLUS
A1
A2
3
4
BANANAPLUS
3
B5
6
PEARPLUS
2
Q15
C17
3
Z02
;
RUN;
data want;
 set have;
 length cat $ 40;
 retain cat;
 if anydigit(category)=0 then do;
  cat=category;delete;
 end;
run;

Frequent Contributor
Posts: 87

Re: create two new variables based on values of original variable

Thanks KSHARP. This one works as well, perhaps little better because there is no dependence on length.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 211 views
  • 2 likes
  • 4 in conversation