Hi, please help me write code for this task.
I have a list of bigrams, and I want to count how many times they occur in the mr_name field and output that count.
For example, the first row of mr_name, "cxhimgqabout", contains two bigrams from the list, "cx" and "gq", so my counter variable should be 2.
| list of bigram | mr_name | counter |
| bq | cxhimgqabout | 2 |
| bz | lolfvchickbznicefx | 3 |
| cf | howmefvtoldbqcv | 3 |
| cj | toldyounothing | 0 |
| cv | | |
| cx | | |
| fq | | |
| fv | | |
| fx | | |
| fz | | |
| gq | | |
Pretty straightforward and linear-->
data have;
infile cards truncover;
input bigram $ mr_name :$20. ;
cards;
bq cxhimgqabout 2
bz lolfvchickbznicefx 3
cf howmefvtoldbqcv 3
cj toldyounothing 0
cv
cx
fq
fv
fx
fz
gq
;
proc transpose data=have out=w;
var bigram;
run;
data want;
set have;
if _n_=1 then set w;
array t col:;
count=0;
do i=1 to dim(t);
count+count(mr_name,strip( t(i)));
end;
drop col: i;
run;
I'm sorry, I should have mentioned that the bigram variable and the mr_name variable are in two different datasets. Do I need to join them to do this?
OK @helloSAS, no worries. Going forward, please describe your question in full detail. Just a request, because it helps and saves everyone's time.
/*create two datasets bigram and name*/
data bigram(keep=bigram) name(keep=Mr_name);
infile cards truncover;
input bigram $ mr_name :$20. ;
if not missing(mr_name) then output name;
output bigram;
cards;
bq cxhimgqabout 2
bz lolfvchickbznicefx 3
cf howmefvtoldbqcv 3
cj toldyounothing 0
cv
cx
fq
fv
fx
fz
gq
;
proc transpose data=bigram out=w(drop=_name_);
var bigram;
run;
data want;
set name;
if _n_=1 then set w;
array t col:;
count=0;
do over t;
count+count(mr_name,strip( t));
end;
drop col: ;
run;
Hello @novinosrin, from what I have learned recently, implicit arrays are deprecated in the latest releases and are currently supported only for backward compatibility.
I am a Paul Dorfman aka Hashman and Pierre Gagnon aka PGstats wannabe. PD still uses them to this day, so will I. My orientation is straight, otherwise I would ask them out on a date even though they are grey-haired men. 🙂 In love with their brilliance.
Good luck.
Can you post a link to the documentation stating that implicit arrays are deprecated?
The deprecation of implicit array processing is implicit (pun intended). Neither "do over" nor the _i_ referencing method is mentioned anymore in the current documentation.
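For readers who would rather avoid "do over", the loop from the earlier step can be written with an explicit array and index. This is only a minimal, self-contained sketch: the data set names name, bigram and w mirror the earlier post, while the {*} array declaration and the variable name n_hits are choices made here for illustration, not something the original posters wrote.

```sas
/* Sample data, split into two data sets as in the earlier post */
data name;
  input mr_name :$20.;
  cards;
cxhimgqabout
lolfvchickbznicefx
howmefvtoldbqcv
toldyounothing
;
run;

data bigram;
  input bigram $;
  cards;
bq
bz
cf
cj
cv
cx
fq
fv
fx
fz
gq
;
run;

/* One row holding every bigram as col1-col11 */
proc transpose data=bigram out=w(drop=_name_);
  var bigram;
run;

data want;
  set name;
  if _n_ = 1 then set w;   /* read once; variables read by SET are retained */
  array t {*} col:;        /* explicit array over the transposed columns */
  n_hits = 0;              /* hypothetical name, chosen to avoid reusing COUNT */
  do i = 1 to dim(t);
    n_hits + count(mr_name, strip(t{i}));   /* add occurrences of this bigram */
  end;
  drop col: i;
run;
```

On the sample data this should reproduce the counters from the question (2, 3, 3 and 0).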
data have;
input mr_name $19.;
cards;
cxhimgqabout
lolfvchickbznicefx
howmefvtoldbqcv
toldyounothing
;
run;
data bigram;
input bigram $;
cards;
bq
bz
cf
cj
cv
cx
fq
fv
fx
fz
gq
;
run;
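data want(drop=bigram);
if _n_=1 then do;
if 0 then set bigram; /* define bigram in the PDV without reading a row */
dcl hash H (dataset:'bigram') ; /* load the bigram list into a hash table once */
h.definekey ("bigram") ;
h.definedata ("bigram") ;
h.definedone () ;
dcl hiter hh('h'); /* iterator used to walk through every bigram */
end;
set have;
do while(hh.next()=0); /* loop over all bigrams for the current mr_name */
count=sum(count,count(mr_name,strip(bigram))); /* add the occurrences of this bigram */
end;
run;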
Updated code, based on your new requirement.
data have;
input mr_name $19.;
cards;
cxhimgqabout
lolfvchickbznicefx
howmefvtoldbqcv
toldyounothing
;
run;
data bigram;
input bigram $;
cards;
bq
bz
cf
cj
cv
cx
fq
fv
fx
fz
gq
;
run;
proc transpose data =bigram out=tmp;
var bigram;
run;
data want;
if _n_ =1 then do;
set tmp;
end;
set have;
array bigram[11] $ col1-col11;
counter=0;
do i=1 to dim(bigram);
if index(mr_name,strip(bigram[i]) ) > 0 then counter=counter+1;
end;
drop col: i _name_;
run;
data have;
input mr_name $19.;
cards;
cxhimgqabout
lolfvchickbznicefx
howmefvtoldbqcv
toldyounothing
;
run;
data want;
set have;
array bigram[11] $ ('bq','bz','cf','cj','cv','cx','fq','fv','fx','fz','gq');
counter=0;
do i=1 to dim(bigram);
if index(mr_name,strip(bigram[i]) ) > 0 then counter=counter+1;
end;
drop bigram: i;
run;
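One thing to keep in mind when choosing between the answers in this thread: the steps that use INDEX add 1 per bigram that is present, while the steps that use COUNT add every occurrence of each bigram. On the sample data both approaches give the same result, because no bigram appears twice in a name. A small sketch with a made-up value (cxabcx is not from the thread) shows where they would differ:

```sas
data _null_;
  mr_name = 'cxabcx';                          /* hypothetical value: 'cx' occurs twice */
  occurrences = count(mr_name, 'cx');          /* number of times 'cx' appears */
  present     = (index(mr_name, 'cx') > 0);    /* 1 if 'cx' appears at all, else 0 */
  put occurrences= present=;                   /* log: occurrences=2 present=1 */
run;
```

Which behaviour is wanted depends on whether a repeated bigram should count once or every time; the original question does not say.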
Thank you!
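data result;
set name; /* one row per mr_name */
count=0;
do i=1 to n; /* n is filled in at compile time by nobs= */
set bigram nobs=n point=i; /* read the i-th bigram by direct access */
if index(mr_name,strip(bigram))>0 then count=count+1;
end;
keep mr_name count;
run;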
data have;
input mr_name $19.;
cards;
cxhimgqabout
lolfvchickbznicefx
howmefvtoldbqcv
toldyounothing
;
run;
data bigram;
input bigram $;
cards;
bq
bz
cf
cj
cv
cx
fq
fv
fx
fz
gq
;
run;
proc sql;
create table want as
select mr_name,count(bigram) as count
from have as a left join bigram as b
on mr_name contains strip(bigram)
group by mr_name;
quit;