Dataset categorized column

Occasional Contributor
Posts: 5

Hello! I'm just starting with Base SAS and I have a question.

I can't find a way to create a new column that categorizes the words from another column in the same dataset.

I need to create a column (named category) with 3 different values: "Clothes", "Bags" and "Other". So when the main column says t-shirt, the new category column would say "Clothes"; if it says shoes, the new column would say "Other", etc.

"Clothes" would search for words like t-shirts, hats, etc.

"Bags" would search for backpacks, packs, etc.

"Other" would cover the rest of the words.

Thank you for the help!


Accepted Solutions
Solution
‎11-06-2016 01:59 PM
PROC Star
Posts: 768

Re: Dataset categorized column

Be aware that the text may differ in letter case. Lowercasing the value before comparing handles that, like this:

data want;
   set have;

   if lowcase(main_column) in ('t-shirts', 'hoodies', 'hats') then do;
      new_column = 'Clothes';
   end;
   else if lowcase(main_column) in ('bags', 'backpacks') then do;
      new_column = 'Bags';
   end;
   else do;
      new_column = 'Other';
   end;
run;
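
The question speaks of "searching" for words, so the values may contain a keyword rather than equal it exactly, and an IN list would miss those rows. The following is only a sketch of one way to handle that, not part of the original answer: it uses the FIND function with the 'i' modifier for a case-insensitive substring search. The sample rows in have and the keyword choices are assumptions made for illustration.

```sas
/* Hypothetical sample data, only to make the sketch self-contained */
data have;
   input main_column $20.;
   datalines;
Blue T-shirt
Leather bag
Running shoes
;
run;

data want;
   set have;
   length new_column $7;   /* long enough for 'Clothes' */

   /* FIND returns the position of the substring, or 0 if it is absent;
      the 'i' modifier makes the search case-insensitive */
   if find(main_column, 't-shirt', 'i')
      or find(main_column, 'hoodie', 'i')
      or find(main_column, 'hat', 'i') then new_column = 'Clothes';
   else if find(main_column, 'bag', 'i')
      or find(main_column, 'backpack', 'i') then new_column = 'Bags';
   else new_column = 'Other';
run;
```

Keep in mind that a substring search is greedy: 'hat' would also match a value such as "chat", so short keywords need care.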



All Replies
PROC Star
Posts: 768

Re: Dataset categorized column

Something like this? :)

data want;
   set have;

   if main_column in ('T-Shirt' /*, Insert other types of clothes here separated by , */) then do;
      new_column = 'Clothes';
   end;
   else if main_column in ('Shoes') then do;
      new_column = 'Other';
   end;
   else if main_column in ('Some Bags Name') then do;
      new_column = 'Bags';
   end;
run;
Frequent Learner
Posts: 1

Re: Dataset categorized column

PROC Star
Posts: 768

Re: Dataset categorized column

Not sure I understand completely, but you can put as many words inside the parentheses as you want, as below. :)

data want;
   set have;

   if main_column in ('T-Shirt' /*, Insert other types of clothes here separated by , */) then do;
      new_column = 'Clothes';
   end;
   else if main_column in ('Shoes', 'Socks', 'Ties') then do;
      new_column = 'Other';
   end;
   else if main_column in ('Some Bags Name') then do;
      new_column = 'Bags';
   end;
run;
Occasional Contributor
Posts: 5

Re: Dataset categorized column

I know, thank you again for your fast reply :)

I mean, imagine you have 1000 words to put in the parentheses. Is there a way to say this:

T-shirts, hoodies, hats = clothes
bags, backpacks = bags
all the rest of the words = other ?

PROC Star
Posts: 768

Re: Dataset categorized column

Ah, now I get it :) Sure:

data want;
   set have;

   if main_column in ('T-shirts', 'hoodies', 'hats') then do;
      new_column = 'Clothes';
   end;
   else if main_column in ('bags', 'backpacks') then do;
      new_column = 'Bags';
   end;
   else do;
      new_column = 'Other';
   end;
run;
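
With a very long word list, the IN lists above become hard to maintain. One alternative, not proposed in this thread and sketched here under assumptions, is to keep the word-to-category mapping in a user-defined format and apply it with PUT. The format name $category and the sample rows in have are hypothetical.

```sas
/* Hypothetical sample data, only to make the sketch self-contained */
data have;
   length main_column $20;
   input main_column $;
   datalines;
T-Shirts
Hoodies
backpacks
Shoes
;
run;

/* The mapping lives in one place; OTHER catches every unlisted word */
proc format;
   value $category
      't-shirts', 'hoodies', 'hats' = 'Clothes'
      'bags', 'backpacks'           = 'Bags'
      other                         = 'Other';
run;

data want;
   set have;
   length new_column $7;   /* long enough for 'Clothes' */

   /* Lowercase and strip the value so it matches the lowercase keys */
   new_column = put(lowcase(strip(main_column)), $category.);
run;
```

The design choice here is to separate the mapping from the DATA step logic, so adding a word means editing only the PROC FORMAT block.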
Occasional Contributor
Posts: 5

Re: Dataset categorized column

That's great, thank you! :D
PROC Star
Posts: 768

Re: Dataset categorized column

Anytime. Please mark my answer as a solution if it solved your problem :) Regards.
