Xiaoyi
Obsidian | Level 7

SAS experts, 

I have a simple question. I want output containing only the missing and non-missing counts. Here is my code. How can I add the output to the code? Thanks.

 

proc means data=A nmiss n ;
class year;
run;

Accepted Solutions
ballardw
Super User

Depending on exactly what you want, something like the following may work, if you want the N and NMISS of every numeric variable other than Year in your data set.

Proc means data=A nway;
   class year;
   output out=work.summary n= nmiss= /autoname;
run;

Statistics like N and NMISS in PROC MEANS need a numeric variable to be computed on. For an output data set you need to supply output variable names. The /autoname option builds each output variable name by appending the statistic, N or NMiss, to the numeric variable's name.
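To make the naming concrete, here is a minimal sketch on made-up data. The data set work.a_demo, the output data set work.summary_demo, and the variables x and y are hypothetical and only serve to illustrate what /autoname produces.

```sas
* Hypothetical toy data: year is the class variable, x and y are numeric;
data work.a_demo;
   input year x y;
   datalines;
2019 1 .
2019 . .
2020 3 4
2020 5 .
;
run;

* No VAR statement, so every numeric variable other than year is analyzed;
proc means data=work.a_demo nway noprint;
   class year;
   output out=work.summary_demo n= nmiss= / autoname;
run;

* The output holds one row per year with variables such as x_N, y_N, x_NMiss and y_NMiss;
proc print data=work.summary_demo;
run;
```

The output data set also carries the automatic variables _TYPE_ and _FREQ_, where _FREQ_ is the number of observations in each year.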

 

 


11 REPLIES 11
PaigeMiller
Diamond | Level 26

PROC MEANS is not the way to do this.

 

data want;
  set a;
  if missing(year);
run;
 
 
--
Paige Miller
smantha
Lapis Lazuli | Level 10

Proc means data=a missing noprint;
   class vara;
   var varb;
   output out=test nmiss=;
run;

Subtracting nmiss from _freq_ gives you the count of non-missing values.

If the data is sorted by vara, you can use a BY statement instead:

Proc means data=a noprint;
   by vara;
   var varb;
   output out=test nmiss=;
run;
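
As a sketch of that subtraction, assuming the same data set a with vara and varb as above: the output names varb_nmiss and varb_n and the data set test_counts are made up for illustration, and NWAY is added so that only the per-class rows are kept.

```sas
* Count missing values of varb within each level of vara;
proc means data=a missing noprint nway;
   class vara;
   var varb;
   output out=test nmiss=varb_nmiss;
run;

* _FREQ_ is the number of observations in each class level,
  so subtracting the missing count leaves the non-missing count;
data test_counts;
   set test;
   varb_n = _freq_ - varb_nmiss;
run;
```

This relies on there being no FREQ or WEIGHT statement, so that _FREQ_ is a plain observation count.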
Reeza
Super User

There are two ways to do this. You can control the stats on the PROC MEANS statement and on the OUTPUT statement. Do you eventually want these in a data set?

 

*controls output in WANT1 data + displayed values;
proc means data=CLASS N NMISS;
ods output summary = want1;
class year;
*controls output in WANT2 data set but not displayed values;
output out = want2 N = N NMISS = NMISS;
run;


Xiaoyi
Obsidian | Level 7

This works beautifully for all numeric variables. The output gives the count of missing values and the total number of observations by year (or by any other categorical variable).

Thanks, and thank you to everyone who responded.

 

Xiaoyi
Obsidian | Level 7

Checking for missing values is a frequent task for people who manage data. There is another post about counting missing values in both numeric and character variables using PROC IML, but it gets a little complicated when you want the counts by certain categorical variables. So my approach is to use PROC MEANS to count the missing values in all numeric variables and then use another procedure to count the missing values in character variables. Hope this helps.

Reeza
Super User
*set input data set name;
%let INPUT_DSN = class;
%let OUTPUT_DSN = want;
*create format for missing;

proc format;
    value $ missfmt ' '="Missing" other="Not Missing";
    value nmissfmt .="Missing" other="Not Missing";
run;

*Proc freq to count missing/non missing;
ods select none;
*turns off the output so the results do not get too messy;
ods table onewayfreqs=temp;

proc freq data=&INPUT_DSN.;
    table _all_ / missing;
    format _numeric_ nmissfmt. _character_ $missfmt.;
run;

ods select all;

proc print data=temp;run;

https://gist.github.com/statgeek/2de1faf1644dc8160fe721056202f111

 

PROC FREQ can handle both the numeric and character so if you want a single PROC that's a good option.

 



Xiaoyi
Obsidian | Level 7

@Reeza Thank you for offering this solution. I ran it and the results are a little confusing. It doesn't look like something that I can work with.

I use the code below to count the missing values in numeric variables. It works beautifully and produces the output that I like. I want some code that can count missing values in character variables and produce similar output. In my experience, when we try to accomplish everything in one step we sometimes don't get what we want, or it gets too complicated. If we break the task down, a few simple steps can achieve the same goal. Also, I only care about the counts of missing values, so the non-missing counts can be left out. Do you have other suggestions?

For reference, here is the code that counts missing values in all numeric variables, with output by a categorical variable:

proc means data=sensit_dist nway noprint;
class infyob_yot;
output out=misschk n= nmiss= /autoname;
run;

Reeza
Super User
*create fake data;
data class;
    set sashelp.class;

    if age=14 then
        call missing(height, weight, sex);

    if name='Alfred' then
        call missing(sex, age, height);
    label age="Fancy Age Label";
run;

*create missing format;
proc format;
    value $ missfmt ' '="Missing" other="Not Missing";
    value nmissfmt .="Missing" other="Not Missing";
run;

proc freq data=class;
    table _all_ / missing;
    format _numeric_ nmissfmt. _character_ $missfmt.;
run;

It is a more complicated solution but it's also more flexible. An often required statistic is the percentage of data that is missing, not just the Ns, or a format that can be included in a journal, like N(%), e.g. 8(25%).

There are many ways to count missing values, and you'd find several of them even in my GitHub pages. This is another way that generalizes to almost any data set.
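
For a single numeric variable, an N(%) string could be sketched roughly as follows. This is not the PROC FREQ approach above but a simpler illustration built on PROC MEANS, assuming the fake class data set from this post. The names miss_stats, miss_report, height_nmiss, pct_missing and n_pct are made up.

```sas
* Count missing values of height across the whole data set;
proc means data=class noprint;
   var height;
   output out=miss_stats nmiss=height_nmiss;
run;

* Build a journal-style N(%) string from the missing count and _FREQ_;
data miss_report;
   set miss_stats;
   length n_pct $ 20;
   pct_missing = 100 * height_nmiss / _freq_;
   n_pct = cats(height_nmiss, '(', put(pct_missing, 5.1), '%)');
run;

proc print data=miss_report;
   var height_nmiss pct_missing n_pct;
run;
```

Here _FREQ_ is the total number of observations, so pct_missing is the share of rows where height is missing.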

 

 

 

smantha
Lapis Lazuli | Level 10

One drawback of this approach is that observations with missing values in the class variables are not captured in the summary.

ballardw
Super User

Add the /missing option to the CLASS statement. A missing value of the class variable is then treated as a valid level when generating the output.
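
Applied to the PROC MEANS step from earlier in this thread, that would look something like the sketch below. It assumes the same data set A and class variable year; NOPRINT is added only to suppress the printed report.

```sas
* MISSING on the CLASS statement keeps observations whose year is missing,
  so they form their own row in the output data set;
proc means data=A nway noprint;
   class year / missing;
   output out=work.summary n= nmiss= / autoname;
run;
```

Without the option, PROC MEANS drops observations that have a missing value for any class variable before computing the statistics.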
