I want to filter a dataset based on a set of conditions. The problem is that there are a lot of conditions, so I was planning to keep them in an Excel file like this:
ID COUNT TYPE
1 2 A
1 3 B
2 1 C
And I want the conditions to be like:
data to_drop;
  set previous_data;
  where (ID = 1 and COUNT > 2 and TYPE = 'A')
     or (ID = 1 and COUNT > 3 and TYPE = 'B')
     or (ID = 2 and COUNT > 1 and TYPE = 'C');
run;
But instead of manually adding all the hundreds of conditions to the code, I want them to be picked up from the Excel file.
First of all: do NOT use Excel for anything that is supposed to be a reliable solution. DON'T.
Use a DATA step with DATALINES instead:
data lookup;
input ID $ COUNT TYPE $;
datalines;
1 2 A
1 3 B
2 1 C
;
Now, suppose you have a dataset like this:
data have;
input ID $ COUNT TYPE $;
datalines;
1 2 A
1 2 Z
1 3 B
1 3 Y
2 1 C
2 2 C
;
You can use a hash object to do your subset. Note that this keeps only the rows whose ID, COUNT and TYPE all exactly match a row in lookup:
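data want;
  set have;
  /* Load the lookup table into the hash object once, on the first iteration */
  if _n_ = 1 then do;
    declare hash l (dataset:"lookup");
    l.definekey("ID","COUNT","TYPE");
    l.definedone();
  end;
  /* Subsetting IF: keep the row only if its key combination is found in lookup */
  if l.check() = 0;
run;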
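Since the conditions in the question use COUNT > some value rather than an exact match, here is a minimal sketch of how the same hash approach could be adapted. It assumes there is at most one row per ID/TYPE combination in lookup, and THRESHOLD is a name I made up for the renamed COUNT column; neither of these comes from the original post.

```sas
data to_drop;
  set have;
  if _n_ = 1 then do;
    /* Define THRESHOLD (hypothetical name) in the PDV without reading any rows */
    if 0 then set lookup (keep=COUNT rename=(COUNT=THRESHOLD));
    /* Key on ID and TYPE only; carry the COUNT limit along as data */
    declare hash l (dataset:"lookup (rename=(COUNT=THRESHOLD))");
    l.definekey("ID","TYPE");
    l.definedata("THRESHOLD");
    l.definedone();
  end;
  /* Keep the row only if its ID/TYPE is in lookup and COUNT exceeds the limit */
  if l.find() = 0 and COUNT > THRESHOLD;
  drop THRESHOLD;
run;
```

With the example data above, this should select only the row 2 2 C, because it is the only one whose COUNT is strictly greater than the limit stored for its ID and TYPE.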