Hello,
I am using the script below to find certain codes in a very large dataset with hash tables. At the moment I have to list the exact code I want to find in dcode (e.g. 'I412' or 'I519') so that the value returned by the scan function matches a hash key. Instead, I would like to find every code that starts with 'I41' or 'I51'. Is it possible to do that with hash tables, for example by using the colon (:) modifier?
data have;
length id $10 dcode $48;
input id$ dcode$ &;
datalines;
1 I410 J332
2 C450 I412
3 I413 R789
4 R281 I519
5 O603 C351
;
run;
data Codes;
length code $4.;
input code $ ;
datalines;
I410
I412
I413
I519
;
run;
DATA Code_scan;
length code $4.;
if _N_=1 then do;
declare hash P(dataset: 'Codes');
P.defineKey('code');
P.defineData('code');
P.defineDone();
end;
set have;
length Findcode $4;
/* check up to 30 blank-separated codes per record */
do i=1 to 30 by 1;
Findcode=scan(Dcode,i,' ');
if Findcode='' then leave;
if P.find(key:Findcode) = 0
then do;
output;
leave;
end;
end;
run;
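I checked with a colleague who works on the hash object, and you can't use a wildcard in a hash lookup. There are other ways, though, and this one works: load the first three characters of each code into the hash table, then look up the first three characters of each code scanned from dcode.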
data have;
length id $10 dcode $48;
input id$ dcode$ &;
datalines;
1 I410 J332
2 C450 I412
3 I413 R789
4 R281 I519
5 O603 C351
6 3456 I510
7 I444 2345
8 I418 6789
9 I509 1234
10 9876 I412
;
run;
data Codes;
length code $4.;
input code $ ;
datalines;
I410
I412
I413
I519
;
run;
data tmp;
set codes;
chk=substr(code,1,3);
run;
DATA Code_scan;
length chk $3.;
if _N_=1 then do;
declare hash P(dataset: 'tmp');
P.defineKey('chk');
P.defineData('chk');
P.defineDone();
end;
set have ;
length Findcode $4;
do i=1 to 30 by 1;
Findcode=substr(scan(Dcode,i,''),1,3);
if Findcode='' then leave;
if P.find(key:Findcode) = 0
then do;
output;
leave;
end;
end;
run;
proc print;run;
I don't understand. A hash table performs exact matches only. If you want a fuzzy match, try another approach such as PRX (Perl regular expressions).
data have;
length id $10 dcode $48;
input id$ dcode$ &;
datalines;
1 I410 J332
2 C450 I412
3 I413 R789
4 R281 I519
5 O603 C351
;
run;
data want;
set have;
if prxmatch('/\b(I41|I51)/i',dcode);
run;
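As an aside on the colon modifier raised in the question: it does work with ordinary DATA step comparison operators (for example =: and IN:), which compare only as many characters as the shorter operand has. A minimal sketch without a hash object, assuming the have dataset from this thread already exists; the output name want_colon is made up for illustration.

```sas
/* Sketch only: prefix matching with the IN: operator, no hash object. */
/* Assumes dataset HAVE (id, dcode) exists; WANT_COLON is a made-up name. */
data want_colon;
  set have;
  length Findcode $4;
  /* loop over the blank-separated codes in dcode */
  do i = 1 to countw(dcode, ' ');
    Findcode = scan(dcode, i, ' ');
    /* IN: compares only the first 3 characters against each prefix */
    if Findcode in: ('I41', 'I51') then do;
      output;   /* keep the record once, on the first matching code */
      leave;
    end;
  end;
  drop i;
run;
```

Whether this is fast enough on a very large dataset compared with the hash lookup would need to be checked on the real data.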
I use hash-tables to save time - so I would really like to use hash-tables.
Here's another Perl regular expression example. I basically adapted it from an example in the documentation.
data have;
length id $10 dcode $48;
input id$ dcode$ &;
datalines;
1 I410 J332
2 C450 I412
3 I413 R789 I412 I4123
4 R281 I519 I413 I51698
5 O603 C351
;
run;
data want ;
retain re 0 ;
if _n_=1 then do ;
re=prxparse('/I41\w*|I51\w*/') ;
end ;
set have ;
x=prxmatch(re,dcode) ;
start=1 ;
stop=length(dcode) ;
call prxnext(re,start,stop,dcode,position,length) ;
do while(position>0) ;
found=substr(dcode,position,length) ;
put found= ;
output ; /* write the row before PRXNEXT overwrites position and length */
call prxnext(re,start,stop,dcode,position,length) ;
end ;
run ;