Hello,
I am using the script below to find certain codes in a very large dataset with a hash table. At the moment I have to list every exact code I want to find in dcode (e.g. 'I412' or 'I519'), and each word returned by the scan function is looked up against that list. Instead, I would like to find every code that starts with 'I41' or 'I51', still using a hash table, with something like the colon (:) modifier. Is that possible?
data have;
length id $10 dcode $48;
input id$ dcode$ &;
datalines;
1 I410 J332
2 C450 I412
3 I413 R789
4 R281 I519
5 O603 C351
;
run;
data Codes;
length code $4.;
input code $ ;
datalines;
I410
I412
I413
I519
;
run;
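DATA Code_scan;
  length code $4;
  /* Load the lookup table into a hash object once, on the first iteration */
  if _N_=1 then do;
    declare hash P(dataset: 'Codes');
    P.defineKey('code');
    P.defineData('code');
    P.defineDone();
  end;
  set have;
  length Findcode $4;
  /* Check each blank-delimited word of Dcode against the hash */
  do i=1 to 30;
    Findcode=scan(Dcode,i,' ');
    if Findcode='' then leave;
    /* find() returns 0 only on an exact key match */
    if P.find(key:Findcode) = 0 then do;
      output;
      leave;
    end;
  end;
run;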
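I checked with a colleague who handles the hash object: you can't use a wildcard with it. Still, there are other ways, and this one works. It stores the first three characters of each code as the hash key and compares them with the first three characters of each word in dcode.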
data have;
length id $10 dcode $48;
input id$ dcode$ &;
datalines;
1 I410 J332
2 C450 I412
3 I413 R789
4 R281 I519
5 O603 C351
6 3456 I510
7 I444 2345
8 I418 6789
9 I509 1234
10 9876 I412
;
run;
data Codes;
length code $4.;
input code $ ;
datalines;
I410
I412
I413
I519
;
run;
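data tmp;
  length chk $3;
  set codes;
  /* Keep only the three-character prefix of each code */
  chk=substr(code,1,3);
run;
DATA Code_scan;
  length chk $3;
  /* Load the prefixes into a hash object once, on the first iteration */
  if _N_=1 then do;
    declare hash P(dataset: 'tmp');
    P.defineKey('chk');
    P.defineData('chk');
    P.defineDone();
  end;
  set have;
  length Findcode $3;
  /* Look up the first three characters of each blank-delimited word */
  do i=1 to 30;
    Findcode=substr(scan(Dcode,i,' '),1,3);
    if Findcode='' then leave;
    if P.find(key:Findcode) = 0 then do;
      output;
      leave;
    end;
  end;
run;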
proc print;run;
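If the prefixes are already known, as in the original question ('I41' and 'I51'), a variation on the same idea is to load them into the hash directly instead of deriving them from the full codes. The following is only a sketch under that assumption: the dataset name prefixes, the output dataset code_scan_prefix, and the variable word are made up for illustration, and it uses the check method because only the presence of the key matters here.

```sas
/* Hypothetical lookup table that holds the prefixes themselves */
data prefixes;
  length chk $3;
  input chk $;
  datalines;
I41
I51
;
run;

data code_scan_prefix;
  length chk $3 word $48;
  /* Load the prefixes into a hash object once */
  if _n_ = 1 then do;
    declare hash p(dataset: 'prefixes');
    p.defineKey('chk');
    p.defineDone();
  end;
  set have;
  do i = 1 to countw(dcode, ' ');
    word = scan(dcode, i, ' ');
    /* Still an exact-match lookup, but on the first three characters only */
    chk = substr(word, 1, 3);
    if p.check() = 0 then do;
      output;
      leave;
    end;
  end;
  drop i word chk;
run;
```

The lookup itself stays exact; the prefix behaviour comes entirely from truncating each word before it is used as the key.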
I don't understand. A hash table performs exact matches only. If you want a fuzzy match, try another approach such as PRX (Perl regular expressions).
data have;
  length id $10 dcode $48;
  input id$ dcode$ &;
  datalines;
1 I410 J332
2 C450 I412
3 I413 R789
4 R281 I519
5 O603 C351
;
run;
data want;
  set have;
  if prxmatch('/\b(I41|I51)/i',dcode);
run;
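For comparison, the colon modifier from the question does work in ordinary DATA step comparisons, just not inside a hash lookup. A minimal sketch, assuming the have dataset above; the names want_colon and word are made up for illustration:

```sas
data want_colon;
  set have;
  length word $48;
  do i = 1 to countw(dcode, ' ');
    word = scan(dcode, i, ' ');
    /* IN: compares only as many characters as the shorter operand has, */
    /* so this keeps any word that starts with I41 or I51               */
    if word in: ('I41', 'I51') then do;
      output;
      leave;
    end;
  end;
  drop i word;
run;
```

With the five sample rows this should keep ids 1 to 4. Whether it is fast enough on a very large dataset would need to be tested.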
I use hash-tables to save time - so I would really like to use hash-tables.
Here's another Perl regular expression example. I basically adapted it from an example in the documentation.
data have;
length id $10 dcode $48;
input id$ dcode$ &;
datalines;
1 I410 J332
2 C450 I412
3 I413 R789 I412 I4123
4 R281 I519 I413 I51698
5 O603 C351
;
run;
data want ;
retain re 0 ;
if _n_=1 then do ;
re=prxparse('/I41\w*|I51\w*/') ;
end ;
set have ;
x=prxmatch(re,dcode) ;
start=1 ;
stop=length(dcode) ;
call prxnext(re,start,stop,dcode,position,length) ;
do while(position>0) ;
found=substr(dcode,position,length) ;
call prxnext(re,start,stop,dcode,position,length) ;
put found= ;
output ;
end ;
run ;