Hi, please help me write code for this task.
I have a list of bigrams and want to check how many times they occur in the mr_name field and output the counter.
For example, the first row of mr_name, "cxhimgqabout", contains two bigrams from the list, "cx" and "gq", so my counter variable should be 2.
list of bigram | mr_name | counter |
bq | cxhimgqabout | 2 |
bz | lolfvchickbznicefx | 3 |
cf | howmefvtoldbqcv | 3 |
cj | toldyounothing | 0 |
cv | | |
cx | | |
fq | | |
fv | | |
fx | | |
fz | | |
gq | | |
Pretty straightforward and linear:
data have;
infile cards truncover;
input bigram $ mr_name :$20. ;
cards;
bq cxhimgqabout 2
bz lolfvchickbznicefx 3
cf howmefvtoldbqcv 3
cj toldyounothing 0
cv
cx
fq
fv
fx
fz
gq
;
proc transpose data=have out=w;
var bigram;
run;
data want;
set have;
if _n_=1 then set w;
array t col:;
count=0;
do i=1 to dim(t);
count+count(mr_name,strip( t(i)));
end;
drop col: i _name_;
run;
I'm sorry, I should have mentioned that the bigram variable and the mr_name variable are in two different datasets. Do I need to join them to do this?
OK @helloSAS, no worries. Going forward, please describe your question in full detail. Just a request, because it helps and saves everyone's time.
/*create two datasets bigram and name*/
data bigram(keep=bigram) name(keep=Mr_name);
infile cards truncover;
input bigram $ mr_name :$20. ;
if not missing(mr_name) then output name;
output bigram;
cards;
bq cxhimgqabout 2
bz lolfvchickbznicefx 3
cf howmefvtoldbqcv 3
cj toldyounothing 0
cv
cx
fq
fv
fx
fz
gq
;
proc transpose data=bigram out=w(drop=_name_);
var bigram;
run;
data want;
set name;
if _n_=1 then set w;
array t col:;
count=0;
do over t;
count+count(mr_name,strip( t));
end;
drop col: ;
run;
hello @novinosrin, based on my recent learning, the implicit arrays are deprecated in the latest releases, currently being supported only for backward compatibility.
I am a Paul Dorfman aka Hashman and Pierre Gagnon aka PGstats wannabe. PD uses them to this day, so will I. My orientation is straight, otherwise I would ask them out on a date even though they are grey-haired men. 🙂 In love with their brilliance.
Good luck.
Can you post a link to the documentation stating that implicit arrays are deprecated?
The deprecation of implicit array processing is implicit (pun intended). Neither "do over" nor the _i_ referencing method are mentioned anymore in the current documentation.
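For anyone who prefers to avoid "do over", the same loop can be written with an explicit array. This is only a minimal sketch, assuming the name and w datasets created in the earlier reply (w holding the transposed bigrams in col1-col11); the index variable i is introduced here for illustration.

```sas
/* Explicit-array version of the "do over" loop (sketch only).       */
/* Assumes datasets NAME (mr_name) and W (col1-col11) already exist. */
data want;
  set name;
  if _n_ = 1 then set w;        /* load the bigram columns once; they are retained */
  array t{*} col:;              /* explicit array over all col* variables */
  count = 0;
  do i = 1 to dim(t);
    count + count(mr_name, strip(t{i}));   /* add occurrences of this bigram */
  end;
  drop col: i;
run;
```

The only changes from the "do over" version are the {*} subscript on the array statement and the explicit index i, which is dropped at the end.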
data have;
input mr_name $19.;
cards;
cxhimgqabout
lolfvchickbznicefx
howmefvtoldbqcv
toldyounothing
;
run;
data bigram;
input bigram $;
cards;
bq
bz
cf
cj
cv
cx
fq
fv
fx
fz
gq
;
run;
data want(drop=bigram);
if _n_=1 then do;
if 0 then set bigram;
dcl hash H (dataset:'bigram') ;
h.definekey ("bigram") ;
h.definedata ("bigram") ;
h.definedone () ;
dcl hiter hh('h');
end;
set have;
do while(hh.next()=0);
count=sum(count,count(mr_name,strip(bigram)));
end;
run;
Updated code, based on your new requirement:
data have;
input mr_name $19.;
cards;
cxhimgqabout
lolfvchickbznicefx
howmefvtoldbqcv
toldyounothing
;
run;
data bigram;
input bigram $;
cards;
bq
bz
cf
cj
cv
cx
fq
fv
fx
fz
gq
;
run;
proc transpose data =bigram out=tmp;
var bigram;
run;
data want;
if _n_ =1 then do;
set tmp;
end;
set have;
array bigram[11] $ col1-col11;
counter=0;
do i=1 to dim(bigram);
if index(mr_name,strip(bigram[i]) ) > 0 then counter=counter+1;
end;
drop col: i _name_;
run;
data have;
input mr_name $19.;
cards;
cxhimgqabout
lolfvchickbznicefx
howmefvtoldbqcv
toldyounothing
;
run;
data want;
set have;
array bigram[11] $ ('bq','bz','cf','cj','cv','cx','fq','fv','fx','fz','gq');
counter=0;
do i=1 to dim(bigram);
if index(mr_name,strip(bigram[i]) ) > 0 then counter=counter+1;
end;
drop bigram: i;
run;
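One point that may matter when choosing between the replies in this thread: the COUNT function counts every occurrence of a substring, while a test of the form index(...) > 0 only records whether the substring appears at least once. For the sample data both give the same counter, because no bigram repeats within a name. A small sketch with a made-up value (the string 'fvabcfv' is hypothetical, not from the original data) shows where they would differ:

```sas
/* Sketch: COUNT vs. INDEX when a bigram repeats within a name. */
data _null_;
  mr_name = 'fvabcfv';                          /* hypothetical value with 'fv' twice */
  n_occurrences = count(mr_name, 'fv');         /* every occurrence is counted */
  is_present    = (index(mr_name, 'fv') > 0);   /* 1 if found at least once, else 0 */
  put n_occurrences= is_present=;               /* log: n_occurrences=2 is_present=1 */
run;
```

If the counter should reflect how many distinct bigrams from the list appear in a name, the INDEX form fits; if repeated hits should each add to the counter, COUNT is the closer match.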
Thank you!
data result;
set name;
count=0;
do i=1 to n;
set bigram nobs=n point=i;
if index(mr_name,strip(bigram))>0 then count=count+1;
end;
keep mr_name count;
run;
data have;
input mr_name $19.;
cards;
cxhimgqabout
lolfvchickbznicefx
howmefvtoldbqcv
toldyounothing
;
run;
data bigram;
input bigram $;
cards;
bq
bz
cf
cj
cv
cx
fq
fv
fx
fz
gq
;
run;
proc sql;
create table want as
select mr_name,count(bigram) as count
from have as a left join bigram as b
on mr_name contains strip(bigram)
group by mr_name;
quit;