Hi all - I am joining two tables, one of which has diagnostic events for patients and the other being a lookup table categorizing diagnosis codes as inclusions or exclusions. The first is potentially quite large and the 2nd is generally going to be fairly small and contains some complete diagnosis codes and some truncated ones that might have one or more '%' signs in them to make them compatible with the SQL LIKE operator. Here's an example: data claims;
infile cards dlm=',';
length ptid mpid 5 dx $10;
input ptid mpid dx;
cards;
1, 12, H41198
1, 15, C7781
2, 294, H677119
2, 300, G55181
;
run;
data codelist;
infile cards dlm=',';
length code_type $8 code $10 condition $15;
input code_type code condition;
cards;
DX_INCL,C%,CANCER
DX_EXCL,G55181,STUFF
;
run;
proc sql;
select a.ptid, a.mpid, a.dx, b.code_type, b.condition, b.code as lookup_code
from
claims A
inner join
codelist B
on a.dx LIKE trim(b.code); /* <<-- I would like to remove the TRIM function */
quit; The above works fine and produces output as expected: ptid mpid dx code_type condition lookup_code 1 15 C7781 DX_INCL CANCER C% 2 300 G55181 DX_EXCL STUFF G55181 ...however, being that 'LIKE' is abusive enough, processor-wise, when working with large tables, I would really like to be able to at least get rid of the TRIM( ) function in the join. But removing it from the join, even if I have pre-processed the CODELIST table like this: data codelist;
set codelist;
code=trim(code);
run; causes the SELECT to return only the exact match (code G55181) and not the C% code. Maybe TRIM/STRIP/COMPRESS and similar don't really add much processor time and I should just ignore, but I am really puzzled by this behavior. Any ideas would be appreciated! Thanks.
... View more