BookmarkSubscribeRSS Feed
monsterpie
Obsidian | Level 7

Hi all,

 

I have 5 variables (bev_1-bev_5) that are open text responses. I am trying to create a new variable for each bev_1-bev_5 which classifies the responses in each column as either 1 or 2. The first new variable that I want to create, called product_type  should have 2 levels: if "Coca-Cola" or "Pepsi" are found in bev_1-bev_5 then the the variable product should =1. If "Hamburger" is found in any observations of bev_1-bev_5 then product should =2.

 

An example of my dataset is below. Variables product_1-product_5 are the new variables I am trying to create. Product_1 classifies the responses under bev_1, product_2 classifies responses of bev_2, etc.

 

IDbev_1bev_2bev_3bev_4bev_5product_1product_2product_3product_4product_5
1CokeCokePepsiHamburgerHamburger11122
2CokeHamburgerPepsiHamburgerHamburger12122
3PepsiPepsiPepsiHamburgerCoke11121

 

Here is an example of the progress I've made so far with my code. This code runs but ouputs only one variable and identifys which column "Coke" or "Pepsi" is located.  I'm also not sure if an array is the most appropriate function to use in this case. Any help is appreciated!

data want;
set have;
array x{*} bev_1-bev_5;
product_type=whichc("Coke","Pepsi",of x{*});
run;

 

4 REPLIES 4
mkeintz
PROC Star
data want;
  set have;
  array x{*} bev_1-bev_5;
  array product_ {5};
  do i=1 to 5;
    if x{i} in ('Coke','Pepsi') then product_{i}=1;
    else if x{i}='Hamburger' then product_{i}=2;
  end;
  drop i;
run;

ARRAY is the right tool to use, but you should also implement a do loop over the array elements:

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Reeza
Super User
Do you have a mapping table for these values? If it's just the two sets of conditions it's not hard but this won't scale well if you have more data and categories. If you had a table that maps the value, Coke = 1, Pepsi=1, Hamburger=2 in a data set then this could be done nicely with a format and array.



monsterpie
Obsidian | Level 7
I'm really new to SAS so I'm not exactly sure what a mapping table is? I will need to scale this up because I have more categories and will be creating more variables using the code based on this post.
Reeza
Super User
A look up table, that has the original values and the values you want them to be. Basically do you have a table that looks like:


Code Product
1 Coke
1 Pepsi
2 Hamburger
....

Use a format, see this document on how Formats work:
https://support.sas.com/resources/papers/proceedings/proceedings/sugi30/001-30.pdf
https://support.sas.com/resources/papers/proceedings/proceedings/sugi30/001-30.pdf

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 4 replies
  • 913 views
  • 1 like
  • 3 in conversation