A data step solution using keyed lookup may be faster.
If your data looks something like this:
data b;
  input id (code_1-code_4) ($);
  cards;
123 code1_A code2_B code3_W code4_D
124 code1_E code2_B code3_P code4_R
125 code1_H code2_J code3_K code4_E
126 code1_A code2_P code3_A code4_K
;
run;
data a;
  infile cards dsd delimiter=';' truncover;
  /* $40 leaves room for the longest list (31 characters); $20 would truncate it */
  length ID $2 code_1_list code_2_list code_3_list $40;
  input ID--code_3_list;
  cards4;
A1;'code1_A', 'code1_C';'code2_A', 'code2_K', 'code2_G';'code3_F'
A2;'code1_B', 'code1_C', 'code1_J';'code2_A', 'code2_G';'code3_L'
A3;'code1_A';'code2_B', 'code2_P';'code3_F';
;;;;
run;
You first transpose and index the codes in B:
data codes(index=(idx=(num code)) keep=num code);
  set b;
  array codes(*) code_:;
  do num=1 to dim(codes);   /* num = position of the code within the row */
    code=codes(num);
    output;                 /* one row per (num, code) pair */
  end;
run;
Then you use keyed lookup to check the A dataset:
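data want;
  set a;
  array lists(*) code_:;
  do num=1 to dim(lists);
    /* count and scan with the same delimiters: quote, comma, blank */
    do _N_=1 to countw(lists(num),"', ");
      code=scan(lists(num),_N_,"', ");
      set codes key=idx/unique;   /* look up (num, code) in the index */
      if not _iorc_ then do;      /* _iorc_=0 means the key was found */
        output;
        _error_=0;                /* clear the flag set by earlier failed lookups */
        delete;                   /* done with this row of A */
      end;
    end;
  end;
  _error_=0;                      /* no match at all: keep the log clean */
  drop num code;
run;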
The "_error_=0" statements are there because every lookup that does not find a code sets _ERROR_ to 1, which generates an "error" and a dump of the current row in the log. You probably do not want to see that.
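If you would rather test the lookup result explicitly than rely on "not _iorc_", here is a minimal sketch of the same idea on a tiny made-up lookup table. The dataset names (codes_demo, check) and the variable found are only for illustration, and it assumes the %SYSRC autocall macro is available through your SASAUTOS path, which is normally the case.

```sas
/* Hypothetical mini lookup table with a composite index on (num code) */
data codes_demo(index=(idx=(num code)));
  input num code $;
  cards;
1 code1_A
2 code2_B
;
run;

/* Look up each (num, code) pair and branch on the _IORC_ mnemonic */
data check;
  input num code $;
  set codes_demo key=idx / unique;
  select (_iorc_);
    when (%sysrc(_sok)) found=1;      /* key found in the index */
    when (%sysrc(_dsenom)) do;        /* no observation matches the key */
      found=0;
      _error_=0;                      /* suppress the error dump in the log */
    end;
    otherwise put 'Unexpected lookup result: ' _iorc_=;
  end;
  cards;
1 code1_A
1 code1_Z
2 code2_B
;
run;
```

The second input row has no match in codes_demo, so it should go through the _DSENOM branch. Only that branch resets _error_, so any other, genuinely unexpected problem still shows up in the log.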