BookmarkSubscribeRSS Feed
coolbeans
Fluorite | Level 6

Hi,

I'm trying to create  a missing table in which one of variables have multiple subcategories. I used this code but it only gives me the overall missing count and I want to see how much is missing in each section including the count and percent .  An example of the table I am trying to create is below. Thanks!

 

proc freq data=work.sleep;
tables educ*sleep /missing;
run;

 

 

 sleep hrs 
 5 hrs 6 hrs
high school count  (%missing)count (%missing)
college count(%missing)count (%missing)
graduate count (%missing)count (%missing)

 

 

 

4 REPLIES 4
Ksharp
Super User
Better to post your sample data and desired output to clarify your demand.
Here 'count' is the count of non-missing obs or missing obs ? and how do you calculaed %missing (=a/b how do you define a and b)?
ChrisNZ
Tourmaline | Level 20

You can fully control the contents of a cell if you create a format.

For example:

data CARS;
  set SASHELP.CARS;
  if mod(_N_,5) = 0 then LENGTH=.;
  keep ORIGIN TYPE LENGTH;
run;
  
proc sql;
  create table CNT as 
  select a.ORIGIN
       , b.TYPE                 as CARTYPE
       , sum(LENGTH is missing) as M 
       , count(*)               as N 
       , 'miss'                 as FMTNAME
       , 'N'                    as TYPE
       , monotonic()            as OBS
       , case when calculated M = calculated N
              then 'No records'
              else catx(' ', calculated M, 'missing out of' , calculated N, '=', put(calculated M /calculated N, percent.) )            
              end               as LABEL
  from (select unique ORIGIN from CARS) a
         full outer join
       (select unique TYPE   from CARS) b   
         on 1
         left join 
       CARS                             c
         on  a.ORIGIN = c.ORIGIN 
         and b.TYPE   = c.TYPE 
  group by 1, 2;
quit;

proc format cntlin=CNT(rename=(OBS=START));
run;

proc report data=CNT;
  columns CARTYPE ORIGIN, OBS ;
  define CARTYPE / group ;
  define ORIGIN  / across;
  define OBS     / sum format=miss. ' ';
run;  

ChrisNZ_0-1742359480768.png

 

 

Note:

 

case when calculated M = calculated N

should be

case when missing(c.ORIGIN) | missing(c.TYPE)

but this ruins the group by operation for some reason.

If someone knows why...

 

 

ChrisNZ
Tourmaline | Level 20

If someone knows why...

I answer my own question:

Testing c.ORIGIN and c.TYPE assesses observation-level values while testing calculated M and calculated N assesses summarised values.

 

Adding a presence flag to see whether a record is prsent  computes the counts properly:

 

data HAVE;
  retain FLG 1;
  set SASHELP.CARS;
  if mod(_N_,5) = 0               then LENGTH=.;
  if ORIGIN='Asia' and TYPE='SUV' then LENGTH=.;
run;
  
proc sql;
  create table COUNTS as 
  select a.ORIGIN
       , b.TYPE                 as CARTYPE
       , sum(LENGTH is missing) as M 
       , count(*)               as N 
       , 'miss'                 as FMTNAME
       , 'N'                    as TYPE
       , monotonic()            as OBS
       , case when sum(FLG) < 1
              then 'No records'
              else catx(' ', calculated M, 'missing out of' , calculated N, '=', put(calculated M /calculated N, percent.) )            
              end               as LABEL
  from (select unique ORIGIN from HAVE) a
         full outer join
       (select unique TYPE   from HAVE) b   
         on 1
         left join 
       HAVE                             c
         on  a.ORIGIN = c.ORIGIN 
         and b.TYPE   = c.TYPE 
  group by 1, 2;
quit;

proc format cntlin=COUNTS(rename=(OBS=START));
run;

proc report data=COUNTS;
  columns CARTYPE ORIGIN, OBS;
  define CARTYPE / group ;
  define ORIGIN  / across;
  define OBS     / sum format=miss. ' ';
run;  

 

 

 

PaigeMiller
Diamond | Level 26

@coolbeans : If you absolutely have to have text in each cell like "count  (%missing)", and if you absolutely have to have redundant titles over the columns, then your programming will be more complicated and hard, as shown by @ChrisNZ .

 

If instead you can produce a table that looks something like this (which contains the same information and in my opinion is easier to read)

 

PaigeMiller_0-1742386149983.png

 

then this is a lot easier to produce via PROC REPORT or PROC TABULATE without the complicated SQL. Please provide the information requested by @Ksharp so that we can provide example code.

 

--
Paige Miller

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 858 views
  • 0 likes
  • 4 in conversation