Count unique pipe delimited codes by group

N/A
Posts: 1

I am new to SAS and have run into a problem that I cannot solve. I am trying to count the number of unique codes in each group. Each group consists of multiple rows, and the codes are pipe-delimited in each cell. A sample of my dataset looks like this:

GROUP CODES
1 |231|322|414|
1 |231|322|2 |231|
2 |231|114|
2
3
3

So in GROUP 1 there are 3 unique codes (231, 322, and 414), and in GROUP 2 there are 2 unique codes (231 and 114). There are 0 codes in GROUP 3. Can anyone tell me how to get SAS to produce an output dataset like this:

Group Count
1 3
2 2
3 0

Thanks in advance.
Chris
Super Contributor
Posts: 3,174

Re: Count unique pipe delimited codes by group

Using a DATA step, read each record with a bare INPUT ; statement (no variables specified), then parse _INFILE_: first extract your GROUP, then iterate with SCAN over each code present in CODES and do an OUTPUT for each one. Then use PROC SORT NODUPKEY with BY GROUP and the code variable, and finally PROC SUMMARY with BY GROUP to get your count in _FREQ_.
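A minimal sketch of the approach described above, not tested against the original data. The dataset names (long, counts), the variable name code, and the tidied sample rows are assumptions made for illustration. It also departs from the description in one respect: it writes a placeholder row with a missing code for every input line and counts with N= rather than _FREQ_, so that a group without codes appears with a count of 0.

```sas
data long;
  input;                                   * bare INPUT: the raw line is held in _INFILE_;
  group = input( scan( _infile_, 1, '| ' ), 8. );
  code = .;
  output;                                  * placeholder row so groups without codes survive;
  i = 2;
  do while ( scan( _infile_, i, '| ' ) ne ' ' );
    code = input( scan( _infile_, i, '| ' ), 8. );
    output;                                * one row per code found on the line;
    i + 1;
  end;
  drop i;
cards;
1 |231|322|414|
1 |231|322|
2 |231|114|
2
3
3
;
run;

* keep one row per distinct group and code combination;
proc sort data=long nodupkey;
  by group code;
run;

* N= counts only nonmissing codes, so the placeholder rows contribute 0;
proc summary data=long nway;
  class group;
  var code;
  output out=counts (drop=_type_ _freq_) n=count;
run;
```

With these assumed sample rows, the counts dataset should hold 3, 2, and 0 for groups 1, 2, and 3.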

Scott Barry
SBBWorks, Inc.

Suggested Google advanced search argument, this topic / post:

data step programming introduction site:sas.com

Message was edited by: sbb
Valued Guide
Posts: 2,175

Re: Count unique pipe delimited codes by group

Normalise it with PROC TRANSPOSE so that it is just
group, code
then use
PROC SUMMARY (or MEANS or FREQ).
Have a look at the online documentation.
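One way this suggestion might be sketched, under several assumptions: the codes are first read into columns code1-code4 (so at most four codes per row), a hypothetical row variable makes each input line its own BY group, and the sample rows are a tidied version of the question's data. The dataset names wide, long, and counts are illustrative only.

```sas
data wide;
  infile cards dlm='|' truncover;
  input group code1-code4;     * assumes at most four codes per row;
  row = _n_;                   * row id so each input line is its own BY group;
cards;
1 |231|322|414|
1 |231|322|
2 |231|114|
2
3
3
;
run;

* turn code1-code4 into one code per row: group, row, code;
proc transpose data=wide out=long (rename=(col1=code) drop=_name_);
  by group row;
  var code1-code4;
run;

* remove repeated codes within a group before counting;
proc sort data=long nodupkey;
  by group code;
run;

* N= counts nonmissing codes, so a group without codes gets 0;
proc means data=long nway noprint;
  class group;
  var code;
  output out=counts (drop=_type_ _freq_) n=count;
run;
```

The de-duplication step is needed because PROC MEANS with N= counts every nonmissing row, not distinct values.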
Frequent Contributor
Posts: 101

Re: Count unique pipe delimited codes by group

Since you are new to SAS, here is one coded solution to your problem.

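data sample;
  length group 3 code 3;
  infile cards dlm='|' truncover;
  * Read the group and the first code, holding the line for further codes;
  input group code @;
  * Always output once so that groups without codes still appear;
  output;
  do while ( not missing( code ));
    input code @;
    if not missing( code ) then output;
  end;
cards;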
1 |231|322|414|
1 |231|322|2 |231|
2 |231|114|
2
3
3
;
run;

proc sort nodupkey data=sample; * nodup option would also work since all variables are in by-group;
  by group code;
run;

* Summary #1: Proc Summary;
proc summary data=sample nway;
  class group;
  var code;
  output out=summary (drop=_:) n=count;
run;

* Summary #2: Proc SQL;
proc sql;
  create table summary2 as
  select
    group,
    sum( case
           when not missing( code ) then 1
           else 0
         end ) as count
  from
    sample
  group by 1
  ;
quit;