Solved: Re: How to obtain count on this table?

Ganeshk · Posted 12-28-2016 10:54 PM

Hi,

I have data table as shown below:

ID	Group	Sub Group	Flag
101	A	AB	1
110	A	AB	1
103	A	IJ	1
108	A	KL	1
501	B	AB	2
503	B	CD	2
507	B	EF	1
508	B	EF	1
509	B	EF	1
515	B	KL	2
511	B	KL	2

I need to count unique ID by Flag as shown below:

Flag-1
Group	Sub Group	count of ID
A	AB	2
A	IJ	1
A	KL	1

Flag-2
Group	Sub Group	count of ID
B	AB	1
B	CD	1
B	EF	3
B	KL	2

Thanks,

Ganesh K

mkeintz · Posted 12-29-2016 09:01 AM

The "distinct id" in Fareeza's proc sql keeps it to a single PROC. Otherwise, if you don't use a data step, then you probably need two proc's. Recursive proc freqs?

proc freq data=have noprint ;
  tables flag*group*subgroup*id / out=need;
run;
proc freq data=need noprint;
  table flag*group*subgroup / out=want (drop=percent rename=(count=n_ids));
run;

x

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

Jagadishkatam · Posted 12-28-2016 11:01 PM

data have;
input ID 	Group$ 	Sub_Group$ 	Flag;
cards;
101 	A 	AB 	1
110 	A 	AB 	1
103 	A 	IJ 	1
108 	A 	KL 	1
501 	B 	AB 	2
503 	B 	CD 	2
507 	B 	EF 	1
508 	B 	EF 	1
509 	B 	EF 	1
515 	B 	KL 	2
511 	B 	KL 	2
;

proc sort data=have;
by group sub_group flag;
run;

data flag1 flag2;
set have;
by group sub_group flag;
retain count;
if first.flag then count=1;
else count=count+1;
drop id;
if last.flag;
if flag=1 then output flag1;
else if flag=2 then output flag2;
run;

Thanks,
Jag

Reeza · Posted 12-29-2016 12:45 AM

When counting unique records, SQL is helpful.

Proc SQL;

create table want as

select flag, group, subgroup, count(distinct ID) as uniqueiDs

from have

group by flag, group, subgroup;

quit.

If you want the strucutre shown shown in your post, use a proc report or proc PRINT on the table from the query above.

Loko · Posted 12-29-2016 03:44 AM

Hello,

data have;
infile datalines;
input ID 	Group $	Sub_Group $	Flag;
datalines;
101 A AB 1
110 A AB 1
103 A IJ 1
108 A KL 1
501 B AB 2
503 B CD 2
507 B EF 1
508 B EF 1
509 B EF 1
515 B KL 2
511 B KL 2
;

proc means noprint nway data=have;
class Group Sub_Group;
var flag;
output out=want(drop=_type_ _freq_) N(Flag)=;
run;

Reeza · Posted 12-29-2016 05:05 AM

Proc means cannot do a distinct count. If your actual data is like your sample data and you have no duplicate rows/repeat customers per flag/group/subgroup this will be fine.

mkeintz · Posted 12-29-2016 09:01 AM

The "distinct id" in Fareeza's proc sql keeps it to a single PROC. Otherwise, if you don't use a data step, then you probably need two proc's. Recursive proc freqs?

proc freq data=have noprint ;
  tables flag*group*subgroup*id / out=need;
run;
proc freq data=need noprint;
  table flag*group*subgroup / out=want (drop=percent rename=(count=n_ids));
run;

x

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

Classroom Training Available!