%let lkup = %str('250' 'E109' 'E111');
%let lkup2 = %str('27889');
Are you after something like below?
data work.small;
input event :$5.;
cards;
250
E109
E111
27889
;
run;
libname sample 'c:\temp';
data subset;
length flag $1;
if _n_ = 1 then
do;
if 0 then set work.small(keep=event);
declare hash h(dataset:'small');
h.defineKey('event');
h.defineDone();
end;
call missing(of _all_);
set sample.sample_qc_diab;
array _ev {*} cpt_4-cpt_13 dx1-dx10;
do _i=1 to dim(_ev);
if h.find(key:_ev[_i])=0 then
do;
flag='Y';
leave;
end;
end;
/* if missing(event) then flag='N';*/
if flag='Y' then output;
run;
libname sample clear;
How would a hash object help? What would it contain? How would that reduce processing time?
Explain in words what it is that the program id doing:
Why are you skipping CPT1 to CPT3 ?
You can simplify your code a little.
%let lkup = '250' 'E109' 'E111' ;
%let lkup2 = '27889' ;
data xx;
set xyz.sample_qc_diab ;
array diag dx1-dx10;
array proc cpt_4-cpt_13 ;
flag=0;
do index=1 to dim(diag) while (not flag);
flag = diag[index] in (&lkup) ;
end;
do index=1 to dim(proc) while (not flag);
flag = proc[index] in (&lkup2) ;
end;
if flag ;
run;
@GPatel wrote:
These are considered to be keys '250' 'E109' 'E111' '27889' or look up table.
From big dataset, look for these keys in dx1-dx10 or cpt4-cpt13. If you encounter key, create a flag as 1.
One can create small dataset with four keys '250' ,'E109', 'E111', '27889' , and use these keys to identify and isolate records where you see these values.
You could. But it would add extra processing time over the current method.
But it would allow two enhancements that might help.
1) The list of diagnosis codes could be larger than want can be put into a single macro variable.
2) You could possibly try to retrieve some other information from the lookup table. But that is not likely in this case since you are scanning multiple different variables with diagnosis codes so what possible auxiliary information could you retrieve from the lookup table?
If you want to store the diagnosis codes into a lookup table you could just add a step to pull them into the macro variable so that you could continue to use the existing data step to filter.
data codes;
input type $ code $ ;
cards;
ICD 250
ICD E109
ICD E111
CPT 27889
;
proc sql noprint;
select quote(trim(code))
into :lkup separated by ' '
from codes
where type='ICD'
;
select quote(trim(code))
into :lkup2 separated by ' '
from codes
where type='CPT'
;
quit;
@GPatel wrote:
Tom:
My goal is to re-engineer the process.
I would like to use HASH method. Is it feasible?
It could work. But it would be more code to maintain and require longer to run.
Essentially you would just end up replacing the IN operator with a call to the FIND() method.
"Is it possible to get the same results using HASH for efficiency?"
To me, the answer is two-fold. Yes, it is possible to do the same thing with a hash object. But for efficiency, I don't see how the hash object will do any good.
You want to re-engineer the process. How? What values would go into the hash object? That is the real question before we can answer if it even makes sense to use a hash object.
The hash object is very powerful and under the right circumstances, it can boost performance tremendously. However, it is not a magic wand that does wonders to the run-time of SAS processes whenever we use it 🙂
Best Regards Peter.
Are you after something like below?
data work.small;
input event :$5.;
cards;
250
E109
E111
27889
;
run;
libname sample 'c:\temp';
data subset;
length flag $1;
if _n_ = 1 then
do;
if 0 then set work.small(keep=event);
declare hash h(dataset:'small');
h.defineKey('event');
h.defineDone();
end;
call missing(of _all_);
set sample.sample_qc_diab;
array _ev {*} cpt_4-cpt_13 dx1-dx10;
do _i=1 to dim(_ev);
if h.find(key:_ev[_i])=0 then
do;
flag='Y';
leave;
end;
end;
/* if missing(event) then flag='N';*/
if flag='Y' then output;
run;
libname sample clear;
@GPatel wrote:
GM Patrick,
I like and accept your solution to my posting. It is what I was looking for, and it works like a charm. Appreciate your help and time.
One question to ask you : Can small data entries be a macro variable, and then use this macro variable in HASH reference as h.defineKey('event');?
Regards, GPatel
If I understand your question right then NO. You need a table/view as source for loading a hash table.
You can always create a macro variable with a string of values and then use that via an IN operator: variable in (¯o_var).
If you have a list of key/value pairs then you could also create a format using cntlin= and then use the format in syntax like: if not missint(put(variable,<format>)) then do...
(just define the other case as returning a blank value).
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.