BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
MB_Analyst
Obsidian | Level 7

I am trying to count observations with two grouping variables, with the output include counts of zero in groups. I've look at this question but having troubling expanding it with two grouping variables.

 

DATA test_dat;
    INPUT ID group $1. varx;
    DATALINES;
    1 a 63
    1 a 67
    1 b 81
    1 b 63
    2 b 11
    2 b 63
    3 a 53
    3 b 83
    ;
RUN;

I want my output to look like:

 

ID     group    count

1         a            2

1         b            2

2         a            0

2         b            2

3         a            1

3         b            1

 

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

DATA test_dat;
    INPUT ID group $1. varx;
    DATALINES;
    1 a 63
    1 a 67
    1 b 81
    1 b 63
    2 b 11
    2 b 63
    3 a 53
    3 b 83
    ;
RUN;

proc sql;
create table want as
select a.*,coalesce(count,0) as count
from 
((select * from (select distinct id from test_dat),(select distinct group from test_dat))) a
left join
(select id, group,count(*) as count from test_dat group by id,group) as  b
on a.id=b.id and a.group=b.group
order by a.id,b.group;
quit;

View solution in original post

7 REPLIES 7
novinosrin
Tourmaline | Level 20

DATA test_dat;
    INPUT ID group $1. varx;
    DATALINES;
    1 a 63
    1 a 67
    1 b 81
    1 b 63
    2 b 11
    2 b 63
    3 a 53
    3 b 83
    ;
RUN;

proc sql;
create table want as
select a.*,coalesce(count,0) as count
from 
((select * from (select distinct id from test_dat),(select distinct group from test_dat))) a
left join
(select id, group,count(*) as count from test_dat group by id,group) as  b
on a.id=b.id and a.group=b.group
order by a.id,b.group;
quit;
RW9
Diamond | Level 26 RW9
Diamond | Level 26

There isn't a way for the procedure to know that data which doesn't exist in the dataset should be returned.  The simplest method is to create a template of the data you want to see out, then merge that on:

data test_dat;
  input id group $1. varx;
datalines;
1 a 63
1 a 67
1 b 81
1 b 63
2 b 11
2 b 63
3 a 53
3 b 83
;
run;

data templ;
  do id=1 to 3;
    do group="a","b";
      output;
    end;
  end;
run;

proc sql;
  create table want as 
  select coalesce(a.id,b.id) as id,
         coalesce(a.group,b.group)  as group,
         coalesce(a.res,0) as result
  from   (select id,group,count(*) as res from test_dat group by id,group) a
  full join templ b
  on     a.id=b.id
  and    a.group=b.group;
quit;
Andygray
Quartz | Level 8

@RW9   You are hardcoding the datastep to get all groups as opposed to @novinosrin  solution that cleverly chooses the distinct group*distinct id  in the subquery making it fully dynamic.  

Astounding
PROC Star

Only because I'm battling a horrible cold and it helps pass the time to address questions here ... here is an approach that might add to your arsenal of tools.

 

Begin by creating a shell of all possible combinations:

 

proc sql;
   create table all_ids as select distinct id from have;
   create table all_groups as select distinct group from have;
   create table complete_shell as select * from all_ids, all_groups orderby id, group;
quit;

Then get your counts and merge it in with the complete shell:

 

proc freq data=have;
   tables id * group / out=counts (keep=id group count) noprint;
run;

data want;
   merge counts complete_shell;
   by id group;
   if count=. then count=0;
run;
ballardw
Super User

And a solution using Proc Summary/means:

 

proc summary data=test_dat completetypes nway;
   class id group;
   output out=junk (drop=_type_ rename=(_freq_=count));
run;

I believe this would more easily scale to 3 or more variables than the SQL solution.

Ksharp
Super User
DATA test_dat;
    INPUT ID group $1. varx;
    DATALINES;
    1 a 63
    1 a 67
    1 b 81
    1 b 63
    2 b 11
    2 b 63
    3 a 53
    3 b 83
    ;
RUN;

proc freq data=test_dat noprint;
table id*group/out=want sparse ;
run;

proc print;run;
Ronein
Meteorite | Level 14

 

DATA Rawdata;
    INPUT ID group $1. y;
    DATALINES;
    1 a 63
    1 a 67
    1 b 81
    1 b 63
    2 b 11
    2 b 63
    3 a 53
    3 b 83
    ;
RUN;


data Sheled;
do id=1,2,3;
do group="a","b";
output;
end;
end;
run;

PROC TABULATE DATA=Rawdata classdata=Sheled format=comma32.  out=want1;
    var y;
	CLASS ID /	ORDER=UNFORMATTED MISSING;
	CLASS group /	ORDER=UNFORMATTED MISSING;
	Table ID=''*group='' ALL={LABEL="Total"},
          Y=''*N='count'/box='ID,Group';
RUN;

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 13135 views
  • 5 likes
  • 8 in conversation