Chris_LK_87
Quartz | Level 8

Hello,

I am using the script below to find certain codes in a very large dataset with a hash table. At the moment I have to list the exact code I want to find (e.g. 'I412' or 'I519'), and each word returned by the SCAN function is looked up against that list. Instead, I would like to find every code that starts with 'I41' or 'I51', still using a hash table, in the way the colon (:) modifier allows. Is that possible?

data have;
length id $10 dcode $48;
input id$ dcode$ &;
datalines;
1 I410 J332
2 C450 I412
3 I413 R789
4 R281 I519
5 O603 C351
;
run;

 

data Codes;
length code $4.;
input code $ ;
datalines;
I410
I412
I413
I519
;
run;

DATA Code_scan;
length code $4.;

if _N_=1 then do;

declare hash P(dataset: 'Codes');
P.defineKey('code');
P.defineData('code');
P.defineDone();
end;

set have;

length Findcode $4;

do i=1 to 30 by 1;
Findcode=scan(Dcode,i,'');
if Findcode='' then leave;
if P.find(key:Findcode) = 0
then do;
output;
leave;

end;
end;

run;

ACCEPTED SOLUTION
AMSAS
SAS Super FREQ

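I checked with a colleague who handles the hash object: you can't use a wildcard in a hash lookup. There are other ways, though, and this one works. It stores only the first three characters of each code as the hash key and looks up the first three characters of each word in dcode.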

data have;
	length id $10 dcode $48;
	input id$ dcode$ &;
datalines;
1 I410 J332
2 C450 I412
3 I413 R789
4 R281 I519
5 O603 C351
6 3456 I510
7 I444 2345
8 I418 6789
9 I509 1234
10 9876 I412
;
run;



data Codes;
	length code $4.;
	input code $ ;
datalines;
I410
I412
I413
I519
;
run;
data tmp;
	set codes;
	chk=substr(code,1,3);
run;


DATA Code_scan;
	length chk $3.;
	if _N_=1 then do;
		declare hash P(dataset: 'tmp');
		P.defineKey('chk');
		P.defineData('chk');
		P.defineDone();
	end;
	set have ;
	length Findcode $4;
	do i=1 to 30 by 1;
		Findcode=substr(scan(Dcode,i,''),1,3);
		if Findcode='' then leave;
		if P.find(key:Findcode) = 0
		then do;
			output;
			leave;
		end;
	end;
run;

proc print;run;
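
If only the prefixes matter, a variation is to store the prefixes themselves in the lookup table and call the CHECK method, which tests for a key without retrieving data. This is a minimal sketch, not part of the original answer; the names prefixes, prefix, stem and code_scan_prefix are made up for illustration, and it assumes the have dataset defined above.

```sas
/* Hypothetical lookup table that holds the prefixes directly */
data prefixes;
	length prefix $3;
	input prefix $;
datalines;
I41
I51
;
run;

data code_scan_prefix;
	length prefix stem $3;
	if _N_=1 then do;
		declare hash P(dataset: 'prefixes');
		P.defineKey('prefix');
		P.defineData('prefix');
		P.defineDone();
		call missing(prefix);
	end;
	set have;
	do i=1 to countw(dcode,' ');
		/* assigning to a $3 variable keeps only the first three characters */
		stem=scan(dcode,i,' ');
		/* CHECK returns 0 when the key exists in the hash */
		if P.check(key:stem)=0 then do;
			output;
			leave;
		end;
	end;
	drop i prefix stem;
run;
```

With the ten-row have dataset above, this should keep ids 1, 2, 3, 4, 6, 8 and 10, the same rows the three-character lookup on tmp selects.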


4 REPLIES
Ksharp
Super User

I don't understand. A hash table performs exact matches only. If you want a fuzzy match, try another approach such as the PRX functions.

 

data have;
length id $10 dcode $48;
input id$ dcode$ &;
datalines;
1 I410 J332
2 C450 I412
3 I413 R789
4 R281 I519
5 O603 C351
;
run;

data want;
 set have;
 if prxmatch('/\b(I41|I51)/i',dcode);
 run;
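
For reference, the colon modifier mentioned in the question applies to comparison operators, not to hash keys: `=:` truncates the longer operand to the length of the shorter one before comparing. A minimal sketch of a prefix test without a hash table or regular expression follows; the names want_colon, word and i are made up for illustration, and it assumes the have dataset defined above.

```sas
data want_colon;
	set have;
	length word $48;
	do i=1 to countw(dcode,' ');
		word=scan(dcode,i,' ');
		/* =: compares only the first three characters of word here */
		if word=:'I41' or word=:'I51' then do;
			output;
			leave;
		end;
	end;
	drop i word;
run;
```

With the five-row have dataset above, this should keep ids 1 to 4. Because the prefixes are hard-coded, it suits a short, fixed list rather than a large lookup table.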
Chris_LK_87
Quartz | Level 8

I use hash tables to save time, so I would really like to keep using them.

AMSAS
SAS Super FREQ

Here's another Perl regular expression example. I basically adapted an example from the documentation.

 

data have;
	length id $10 dcode $48;
	input id$ dcode$ &;
datalines;
1 I410 J332
2 C450 I412
3 I413 R789 I412 I4123
4 R281 I519 I413 I51698
5 O603 C351 
;
run;

data want ;
	retain re 0 ;
	if _n_=1 then do ;
		re=prxparse('/I41\w*|I51\w*/') ;
	end ;
	set have ;
	x=prxmatch(re,dcode) ;
	start=1 ;
	stop=length(dcode) ;
	call prxnext(re,start,stop,dcode,position,length) ;
	do while(position>0) ;
		found=substr(dcode,position,length) ;
		call prxnext(re,start,stop,dcode,position,length) ;
		put found= ;
		output ;
	end ;
run ;

	
	