hi all; may you please assist. I have a dataset where the client column can have multiple clients. I need count the occurrences of the clients showing in the data.
Below is the example of dataset: Count is the data column needed here, ie, Client A appeared 4 times in the data while B showed 5 times.
| Client | Count |
| A | |
| B | |
| A,B | |
| B,A | |
| A,B,C | |
| B | |
| C | |
| D |
data wanted:
| Client | Count |
| A | 4 |
| B | 5 |
| C | 2 |
| D | 1 |
thanks ahead of time for your assistance.
Use an intermediate step to separate your client variable into 1 client per observation (using scan). Then use proc freq to count the instances of each client:
data have;
input client :$50.;
datalines;
A
B
A,B
B,A
A,B,C
B
C
D
;
run;
data temp;
set have;
i=1;
do until (scan(client, i, ',')='');
client_separated=compress(scan(client, i, ','));
output;
i+1;
end;
drop i client;
run;
proc freq data=temp noprint;
tables client_separated / out=want(drop=percent);
run;
Use an intermediate step to separate your client variable into 1 client per observation (using scan). Then use proc freq to count the instances of each client:
data have;
input client :$50.;
datalines;
A
B
A,B
B,A
A,B,C
B
C
D
;
run;
data temp;
set have;
i=1;
do until (scan(client, i, ',')='');
client_separated=compress(scan(client, i, ','));
output;
i+1;
end;
drop i client;
run;
proc freq data=temp noprint;
tables client_separated / out=want(drop=percent);
run;
data have;
input client :$50.;
datalines;
A
B
A,B
B,A
A,B,C
B
C
D
;
run;
data _null_ ;
if _n_=1 then do;
dcl hash H (ordered: "A") ;
h.definekey ("client") ;
h.definedata ("client","count") ;
h.definedone () ;
end;
set have(rename=client=_client) end=z;
length client $10;
do _n_=1 to countw(_client,',');
client=scan(_client,_n_,',');
if h.find() = 0 then count=sum(count,1);
else count=1;
h.replace();
end;
if z;
h.output(dataset:'want');
run;
proc print noobs;run;
data have;
input client :$50.;
datalines;
A
B
A,B
B,A
A,B,C
B
C
D
;
run;
data want;
do until(z);
set have end=z;
array cl(999999) $10 _temporary_;
array co(999999) _temporary_;
length temp $10;
do _n_=1 to countw(client,',');
temp=scan(client,_n_,',');
k=whichc(temp, of cl(*));
if k then co(k)=sum(co(k),1);
else do;
_iorc_+1;
cl(_iorc_)=temp;
co(_iorc_)=1;
end;
end;
end;
do _n_=1 to dim(cl)-cmiss(of cl(*));
client=cl(_n_);
Count=co(_n_);
output;
end;
keep client count;
run;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.