hi all; may you please assist. I have a dataset where the client column can have multiple clients. I need count the occurrences of the clients showing in the data.
Below is the example of dataset: Count is the data column needed here, ie, Client A appeared 4 times in the data while B showed 5 times.
Client | Count |
A | |
B | |
A,B | |
B,A | |
A,B,C | |
B | |
C | |
D |
data wanted:
Client | Count |
A | 4 |
B | 5 |
C | 2 |
D | 1 |
thanks ahead of time for your assistance.
Use an intermediate step to separate your client variable into 1 client per observation (using scan). Then use proc freq to count the instances of each client:
data have;
input client :$50.;
datalines;
A
B
A,B
B,A
A,B,C
B
C
D
;
run;
data temp;
set have;
i=1;
do until (scan(client, i, ',')='');
client_separated=compress(scan(client, i, ','));
output;
i+1;
end;
drop i client;
run;
proc freq data=temp noprint;
tables client_separated / out=want(drop=percent);
run;
Use an intermediate step to separate your client variable into 1 client per observation (using scan). Then use proc freq to count the instances of each client:
data have;
input client :$50.;
datalines;
A
B
A,B
B,A
A,B,C
B
C
D
;
run;
data temp;
set have;
i=1;
do until (scan(client, i, ',')='');
client_separated=compress(scan(client, i, ','));
output;
i+1;
end;
drop i client;
run;
proc freq data=temp noprint;
tables client_separated / out=want(drop=percent);
run;
data have;
input client :$50.;
datalines;
A
B
A,B
B,A
A,B,C
B
C
D
;
run;
data _null_ ;
if _n_=1 then do;
dcl hash H (ordered: "A") ;
h.definekey ("client") ;
h.definedata ("client","count") ;
h.definedone () ;
end;
set have(rename=client=_client) end=z;
length client $10;
do _n_=1 to countw(_client,',');
client=scan(_client,_n_,',');
if h.find() = 0 then count=sum(count,1);
else count=1;
h.replace();
end;
if z;
h.output(dataset:'want');
run;
proc print noobs;run;
data have;
input client :$50.;
datalines;
A
B
A,B
B,A
A,B,C
B
C
D
;
run;
data want;
do until(z);
set have end=z;
array cl(999999) $10 _temporary_;
array co(999999) _temporary_;
length temp $10;
do _n_=1 to countw(client,',');
temp=scan(client,_n_,',');
k=whichc(temp, of cl(*));
if k then co(k)=sum(co(k),1);
else do;
_iorc_+1;
cl(_iorc_)=temp;
co(_iorc_)=1;
end;
end;
end;
do _n_=1 to dim(cl)-cmiss(of cl(*));
client=cl(_n_);
Count=co(_n_);
output;
end;
keep client count;
run;
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.