I want to filter a dataset based on a set of conditions. The problem is that there are a lot of conditions, so I was planning to list them in an Excel file like this:
ID COUNT TYPE
1 2 A
1 3 B
2 1 C
And I want the conditions to be like:
data to_drop;
  set previous_data;
  where (ID = 1 and COUNT > 2 and TYPE = 'A')
     or (ID = 1 and COUNT > 3 and TYPE = 'B')
     or (ID = 2 and COUNT > 1 and TYPE = 'C');
run;
But instead of manually adding hundreds of conditions to the code, I want them to be picked up from the Excel file.
First of all: do NOT use Excel for anything that is supposed to be a reliable solution. DON'T.
Use a DATA step with DATALINES instead:
data lookup;
  input ID $ COUNT TYPE $;
datalines;
1 2 A
1 3 B
2 1 C
;
Now, suppose you have a dataset like this:
data have;
  input ID $ COUNT TYPE $;
datalines;
1 2 A
1 2 Z
1 3 B
1 3 Y
2 1 C
2 2 C
;
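you can use a hash object to do the subset. As written, it keeps only the rows of have whose ID, COUNT and TYPE all match a row of lookup exactly:
data want;
  set have;
  /* load the lookup table into a hash object once, on the first iteration */
  if _n_ = 1 then do;
    declare hash l (dataset:"lookup");
    l.definekey("ID","COUNT","TYPE");
    l.definedone();
  end;
  /* check() returns 0 when the current ID/COUNT/TYPE combination is in the lookup */
  if l.check() = 0;
run;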
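The conditions in the question compare COUNT with a threshold (COUNT > 2, COUNT > 3, ...) rather than testing for equality. A minimal sketch of how the same hash approach might be adapted for that, assuming each ID/TYPE pair appears only once in lookup: key the hash on ID and TYPE only, and load the lookup's COUNT as a data item under the hypothetical name THRESHOLD (that name is an assumption of this sketch, not part of the original data).

```sas
data to_drop;
  /* put THRESHOLD (hypothetical name) into the PDV without reading any rows */
  if 0 then set lookup (keep=COUNT rename=(COUNT=THRESHOLD));
  set have;
  if _n_ = 1 then do;
    /* the lookup's COUNT is loaded as the data item THRESHOLD */
    declare hash l (dataset:"lookup (rename=(COUNT=THRESHOLD))");
    l.definekey("ID","TYPE");
    l.definedata("THRESHOLD");
    l.definedone();
  end;
  /* find() returns 0 and fills THRESHOLD when ID/TYPE is in the lookup; */
  /* keep the row only if its COUNT exceeds that threshold               */
  if l.find() = 0 and COUNT > THRESHOLD;
  drop THRESHOLD;
run;
```

With the have and lookup datasets above, only the row 2 2 C satisfies its condition (COUNT > 1 for ID 2, TYPE C). If one ID/TYPE pair can carry several thresholds, the hash would need to allow duplicate keys, which this sketch does not handle.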