I have distributors that feature a number of products. Some only 1, but others up to 25. I'd like to scan all 25 columns and return if a distributor features that product
The data is aligned as follows:
Distributor_ID First_Product Second_Product......................................25_Product
1 Apples Bananas
2 Grapes
3 Bananas Grapes Kiwis
etc
etc
In EXCEL it's dead easy using COUNTIF(array,value). So, I can create a new column in Excel for each Product. Since Distributor #1 sells Apples, a variable APPLES would have a value of 1 (vs 0)
1) What's the best way to handle this in SAS?
2) If the products were concatenated into a single variable, best way in that case? EX (Apples|Bananas) for Distributor_ID=1
Not sure what you want your output to look like.
There's the whichC function which checks if a value is in a list of other variables, that returns the index of where apples occurred.
x=whichc('APPLES', of product1-product25);
If you need to know how many times APPLES was in the list I'd concatenate them and then use countW
Untested:
want=countw('APPLES', catx(",", of product1-product25, 'i');
Thanks Reeza
Apples will only happen once - same for all the fruits. I'm leaning toward your "concatenated" answer. I'm waiting to get SAS installed but know this challenge is part of a project I'm working on.
A better alternative is to flip your data and run proc freq and easier to query when in long format.
data long;
set have;
array fruits(25) fruit1-fruit25;
do i=1 to 25;
if fruits(i) ne . then fruit=fruit(i);
output;
end;
keep id i fruit;
run;
proc freq data=long;
table id*fruit;
run;
*All ID's that have apple:
data apples;
set long;
where fruit='APPLE';
run;
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.