BookmarkSubscribeRSS Feed
coolbeans
Fluorite | Level 6

Hi,

I'm trying to create  a missing table in which one of variables have multiple subcategories. I used this code but it only gives me the overall missing count and I want to see how much is missing in each section including the count and percent .  An example of the table I am trying to create is below. Thanks!

 

proc freq data=work.sleep;
tables educ*sleep /missing;
run;

 

 

 sleep hrs 
 5 hrs 6 hrs
high school count  (%missing)count (%missing)
college count(%missing)count (%missing)
graduate count (%missing)count (%missing)

 

 

 

4 REPLIES 4
Ksharp
Super User
Better to post your sample data and desired output to clarify your demand.
Here 'count' is the count of non-missing obs or missing obs ? and how do you calculaed %missing (=a/b how do you define a and b)?
ChrisNZ
Tourmaline | Level 20

You can fully control the contents of a cell if you create a format.

For example:

data CARS;
  set SASHELP.CARS;
  if mod(_N_,5) = 0 then LENGTH=.;
  keep ORIGIN TYPE LENGTH;
run;
  
proc sql;
  create table CNT as 
  select a.ORIGIN
       , b.TYPE                 as CARTYPE
       , sum(LENGTH is missing) as M 
       , count(*)               as N 
       , 'miss'                 as FMTNAME
       , 'N'                    as TYPE
       , monotonic()            as OBS
       , case when calculated M = calculated N
              then 'No records'
              else catx(' ', calculated M, 'missing out of' , calculated N, '=', put(calculated M /calculated N, percent.) )            
              end               as LABEL
  from (select unique ORIGIN from CARS) a
         full outer join
       (select unique TYPE   from CARS) b   
         on 1
         left join 
       CARS                             c
         on  a.ORIGIN = c.ORIGIN 
         and b.TYPE   = c.TYPE 
  group by 1, 2;
quit;

proc format cntlin=CNT(rename=(OBS=START));
run;

proc report data=CNT;
  columns CARTYPE ORIGIN, OBS ;
  define CARTYPE / group ;
  define ORIGIN  / across;
  define OBS     / sum format=miss. ' ';
run;  

ChrisNZ_0-1742359480768.png

 

 

Note:

 

case when calculated M = calculated N

should be

case when missing(c.ORIGIN) | missing(c.TYPE)

but this ruins the group by operation for some reason.

If someone knows why...

 

 

ChrisNZ
Tourmaline | Level 20

If someone knows why...

I answer my own question:

Testing c.ORIGIN and c.TYPE assesses observation-level values while testing calculated M and calculated N assesses summarised values.

 

Adding a presence flag to see whether a record is prsent  computes the counts properly:

 

data HAVE;
  retain FLG 1;
  set SASHELP.CARS;
  if mod(_N_,5) = 0               then LENGTH=.;
  if ORIGIN='Asia' and TYPE='SUV' then LENGTH=.;
run;
  
proc sql;
  create table COUNTS as 
  select a.ORIGIN
       , b.TYPE                 as CARTYPE
       , sum(LENGTH is missing) as M 
       , count(*)               as N 
       , 'miss'                 as FMTNAME
       , 'N'                    as TYPE
       , monotonic()            as OBS
       , case when sum(FLG) < 1
              then 'No records'
              else catx(' ', calculated M, 'missing out of' , calculated N, '=', put(calculated M /calculated N, percent.) )            
              end               as LABEL
  from (select unique ORIGIN from HAVE) a
         full outer join
       (select unique TYPE   from HAVE) b   
         on 1
         left join 
       HAVE                             c
         on  a.ORIGIN = c.ORIGIN 
         and b.TYPE   = c.TYPE 
  group by 1, 2;
quit;

proc format cntlin=COUNTS(rename=(OBS=START));
run;

proc report data=COUNTS;
  columns CARTYPE ORIGIN, OBS;
  define CARTYPE / group ;
  define ORIGIN  / across;
  define OBS     / sum format=miss. ' ';
run;  

 

 

 

PaigeMiller
Diamond | Level 26

@coolbeans : If you absolutely have to have text in each cell like "count  (%missing)", and if you absolutely have to have redundant titles over the columns, then your programming will be more complicated and hard, as shown by @ChrisNZ .

 

If instead you can produce a table that looks something like this (which contains the same information and in my opinion is easier to read)

 

PaigeMiller_0-1742386149983.png

 

then this is a lot easier to produce via PROC REPORT or PROC TABULATE without the complicated SQL. Please provide the information requested by @Ksharp so that we can provide example code.

 

--
Paige Miller

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 749 views
  • 0 likes
  • 4 in conversation