Re: Add missing values when counting with proc sql

wankata5 · Posted 06-07-2017 04:37 AM

Hi,

I've got a data set that looks like this (let's assume the grades are only A B C D):

Test Grades

1 A

1 B

1 A

1 C

1 B

1 A

1 C

1 B

1 A

I want to count how many grades are there for each type (including D) and the result to look like the following:

Test Grades Number

1 A 4

1 B 4

1 C 2

1 D 0

I am using the followin code:

proc sql;
create table want as
select test, grades, count(*) as number
from have
group by est, grades
order by est, grades;

quit;

And the problem is that this code doesn't add the row that shows there are 0 D.

RW9 · Posted 06-07-2017 04:56 AM

Yes, it will only count data which is present in the dataset, how can the procedure know that D should be there?

Do your calculation, then merge on a list of exepected values (to get working code post test data in the form of a datastep):

data default_vals;
  input grades $;
  test=1;
datalines;
A
B
C
D
;
run;

/* Your proc sql */

data want;
  merge want default_vals;
  by test grades;
  if number=. then number=0;
run;

Shmuel · Posted 06-07-2017 04:57 AM

In your data there are no "D" nor infinit number of other values.

As much as I know you cannot do what you want by sql.

You can do it by data step, deffining array of all expected codes to count intiated as 0,

At end of input you can print or output the counters. Some of them may contain zero.

data want;
 set have end=eof
       retain cnt c1-c5 0;
       ix = index('ABCD',grade);
       if ix le 5 then cnt(ix) +1;
       if eof then do;
          do ix=1 to dim(cnt);
              grade = substr('ABCD',ix,1);
              count = cnt(ix);
              output;
          end; 
       end;             
       keep grade count;
 run;

Jagadishkatam · Posted 06-07-2017 05:09 AM

data have;
input Test Grades$;
var=1;
cards;
1      A
1      B
1      B
1      A
1      C
1      B
1      A
1      C
1      B
1      A
;

proc format fmtlib;
value $grd
'A'='A'
'B'='B'
'C'='C'
'D'='D'
;
run;

proc summary data=have completetypes nway;
class test;
class  grades / preloadfmt;
var var;
format grades $grd.;
output out=want n=number;
run;

Thanks,
Jag

Ksharp · Posted 06-07-2017 09:27 AM

data have;
input Test Grades$;
var=1;
cards;
1      A
1      B
1      B
1      A
1      C
1      B
1      A
1      C
1      B
1      A
;

data x;
input x $;
cards;
A
B
C
D
;
run;

proc sql;
select a.test,a.x as grades,coalesce(sum(var),0) as count
 from (
 select * from
 (select distinct test from have),
 (select distinct x from x)
 ) as a left join have as b 
  on a.test=b.test and a.x=b.grades 
  group by a.test,a.x;
quit;

Ksharp · Posted 06-07-2017 09:31 AM

data have;
input Test Grades$;
cards;
1      A
1      B
1      B
1      A
1      C
1      B
1      A
1      C
1      B
1      A
;

data x;
input Grades $;
cards;
A
B
C
D
;
run;
data temp;
 set have(in=ina) x;
 w=ina;
run;
proc freq data=temp noprint;
table test*Grades/list out=want(where=(test is not missing)) ;
weight w/zeros;
run;

Add missing values when counting with proc sql