Thanks for the response! This could definitely achieve the desired result, but my question was mainly about the efficiency of the different techniques. Do you believe this would be significantly faster than using a hash object and/or the IN operator? To expand on my initial post, which of the methods below would be fastest? Also, if you could direct me to any literature explaining why, that would be amazing.

For context, the input data set (have) will have well over 1,000,000 observations (perhaps 10,000,000), and there are probably 50 variables that need to be checked in a way similar to var1. The list to be searched is also much longer than just ("ABC" "DEF" "GHI") and contains at least 30 strings.

Examples:

data want_in_method;
set have;
length flag 3.;
if var1 in ("ABC" "DEF" "GHI") then flag = 1;
else flag = 0;
run;
data want_hash_method;
set have;
length flag 3.;
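/* Load the lookup table into the hash object once, on the first iteration */
if _n_ = 1 then do;
  declare hash h(dataset:"string_list");
  h.DefineKey("var1");
  h.DefineDone();
end;
/* check() only tests whether the key exists; find() would also retrieve data */
if h.check() = 0 then flag = 1;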
else flag = 0;
run;
data want_find_method;
set have;
length flag 3.;
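/* "i" ignores case and "t" trims trailing blanks from var1 before searching. */
/* Note that find() matches substrings, so a value such as "BC" would also be flagged. */
if find("ABC DEF GHI", var1, "it") then flag = 1;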
else flag = 0;
run;