helloSAS
Obsidian | Level 7

Hi, please help me write code to do this task.

I have a list of bigrams and I want to check how many times they occur in the mr_name field and output the counter.

For example, the first row of mr_name, "cxhimgqabout", contains two of the bigrams, "cx" and "gq". So my counter variable should be 2.

 

list of bigram   mr_name              counter
bq               cxhimgqabout         2
bz               lolfvchickbznicefx   3
cf               howmefvtoldbqcv      3
cj               toldyounothing       0
cv
cx
fq
fv
fx
fz
gq
14 REPLIES
novinosrin
Tourmaline | Level 20

Pretty straightforward and linear-->

 


data have;
infile cards expandtabs truncover; /* expandtabs: the data lines below are tab-separated */
input bigram $ mr_name :$20.  ;
cards;
bq	cxhimgqabout	2
bz	lolfvchickbznicefx	3
cf	howmefvtoldbqcv	3
cj	toldyounothing 0
cv	 	 
cx	 	 
fq	 	 
fv	 	 
fx	 	 
fz	 	 
gq
;

proc transpose data=have out=w(drop=_name_);
var bigram;
run;
data want;
set have;
if _n_=1 then set w;
array t col:;
count=0;
do i=1 to dim(t);
count+count(mr_name,strip( t(i)));
end;
drop col: i;
run;
helloSAS
Obsidian | Level 7

 

I'm sorry, I should have mentioned that the bigram variable and the mr_name variable are in two different datasets. Do I need to join them to do this?

novinosrin
Tourmaline | Level 20

OK @helloSAS, no worries. Going forward, please describe your question in full detail. Just a request, because it helps and saves everyone's time.

 

/*create two datasets bigram and name*/
data bigram(keep=bigram) name(keep=Mr_name);
infile cards expandtabs truncover; /* expandtabs: the data lines below are tab-separated */
input bigram $ mr_name :$20.  ;
if not missing(mr_name) then output name;
output bigram;
cards;
bq	cxhimgqabout	2
bz	lolfvchickbznicefx	3
cf	howmefvtoldbqcv	3
cj	toldyounothing 0
cv	 	 
cx	 	 
fq	 	 
fv	 	 
fx	 	 
fz	 	 
gq
;


proc transpose data=bigram out=w(drop=_name_);
var bigram;
run;
data want;
set name;
if _n_=1 then set w;
array t col:;
count=0;
do over t;
count+count(mr_name,strip( t));
end;
drop col: ;
run;
r_behata
Barite | Level 11

Hello @novinosrin, from what I have learned recently, implicit arrays are deprecated in the latest releases and are currently supported only for backward compatibility.

novinosrin
Tourmaline | Level 20

I am a Paul Dorfman aka Hashman and Pierre Gagnon aka PGstats wannabe. PD uses them to this day, so will I. My orientation is straight, otherwise I would ask them out on a date even though they are grey-haired men. 🙂 In love with their brilliance.

andreas_lds
Jade | Level 19

Can you post a link to the documentation stating that implicit arrays are deprecated?

Kurt_Bremser
Super User

The deprecation of implicit array processing is implicit (pun intended). Neither "do over" nor the _i_ referencing method are mentioned anymore in the current documentation.
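
For anyone who prefers to avoid "do over", here is a minimal sketch of the same counting step with an explicitly subscripted array. It assumes the NAME and W datasets created in novinosrin's reply above (W holding the transposed bigrams in col1, col2, ...); the index variable i is only an illustrative name, and this is one possible rewrite rather than the documented replacement.

```sas
/* Assumption: NAME (mr_name) and W (col1, col2, ...) exist as built earlier in the thread */
data want;
   set name;
   if _n_ = 1 then set w;      /* read the bigrams once; variables read by SET are retained */
   array t {*} col:;           /* explicit array over every COL variable already in the PDV */
   count = 0;
   do i = 1 to dim(t);         /* explicit index replaces the implicit DO OVER loop */
      count + count(mr_name, strip(t{i}));   /* add the occurrences of this bigram */
   end;
   drop col: i;
run;
```

Because the array is declared with {*}, the loop bound comes from dim(t), so the step should not need editing if the bigram list grows.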

novinosrin
Tourmaline | Level 20
data have;
input mr_name $19.;
cards;
cxhimgqabout
lolfvchickbznicefx
howmefvtoldbqcv
toldyounothing
;
run;

data bigram;
input bigram $;
cards;
bq
bz
cf
cj
cv
cx
fq
fv
fx
fz
gq
;
run;

data want(drop=bigram);
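if _n_=1 then do;
   if 0 then set bigram;             /* define BIGRAM in the PDV without reading any rows */
   dcl hash H (dataset:'bigram') ;   /* load the bigram list into a hash object */
   h.definekey  ("bigram") ;
   h.definedata ("bigram") ;
   h.definedone () ;
   dcl hiter hh('h');                /* iterator used to walk through every bigram */
end;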
set have;
do while(hh.next()=0);
count=sum(count,count(mr_name,strip(bigram)));
end;
run;
r_behata
Barite | Level 11

Updated code, based on your new requirement.

data have;
input mr_name $19.;
cards;
cxhimgqabout
lolfvchickbznicefx
howmefvtoldbqcv
toldyounothing
;
run;

data bigram;
input bigram $;
cards;
bq
bz
cf
cj
cv
cx
fq
fv
fx
fz
gq
;
run;


proc transpose data =bigram out=tmp;
var bigram;
run;

data want;
	if _n_ = 1 then set tmp;   /* transposed bigrams: col1-col11 */
	set have;

	array bigram[11] $ col1-col11;
	counter = 0;
	do i = 1 to dim(bigram);
		/* add 1 for each bigram found at least once in mr_name */
		if index(mr_name, strip(bigram[i])) > 0 then counter = counter + 1;
	end;
	drop col: i _name_;
run;
r_behata
Barite | Level 11
data have;
input mr_name $19.;
cards;
cxhimgqabout
lolfvchickbznicefx
howmefvtoldbqcv
toldyounothing
;
run;


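data want;
	set have;

	/* bigram list hard-coded as initial array values (creates bigram1-bigram11) */
	array bigram[11] $ ('bq','bz','cf','cj','cv','cx','fq','fv','fx','fz','gq');
	counter = 0;
	do i = 1 to dim(bigram);
		/* add 1 for each bigram found at least once in mr_name */
		if index(mr_name, strip(bigram[i])) > 0 then counter = counter + 1;
	end;
	drop bigram: i;
run;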
learsaas
Quartz | Level 8
data  result;
	set name;
	count=0;
	do i=1 to n;
		set bigram nobs=n point=i;
		if index(mr_name,strip(bigram))>0 then count=count+1;
	end;
	keep mr_name count;
run;
Ksharp
Super User
data have;
input mr_name $19.;
cards;
cxhimgqabout
lolfvchickbznicefx
howmefvtoldbqcv
toldyounothing
;
run;

data bigram;
input bigram $;
cards;
bq
bz
cf
cj
cv
cx
fq
fv
fx
fz
gq
;
run;
proc sql;
create table want as
 select mr_name,count(bigram) as count
  from have as a left join bigram as b
   on mr_name contains strip(bigram)
    group by mr_name;
quit;
