BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Satori
Quartz | Level 8

I have a list of 50 variables.

I did a proc freq to do counts of missing/not missing values for each variable individually.

I want to do counts of missing/not missing values for multiple variables simultaneously.

I think I need to use a macro.

 

What I want is something like this:

 

proc format; value missfmt .,0 ='Missing' other='Not Missing';
proc freq data=have; format var1 missfmt.; tables var1 / missing missprint nocum;

 

then I want to add var2 and do a proc freq but the not missing would be that both var1 and var2 are not missing while missing would be that either var1 or var2 is missing.

 

then I want to add var3, and so on until var50.

 

My idea is to see how the number of observations decreases as one extra variable is added.

 

Any suggestions on how to do this? 

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

First point is that your test is not whether the values is missing or not.  Instead you are testing if the value is TRUE or not.  SAS will treat missing or zero as FALSE and any other value as TRUE.

 

So you want to count the number of observations that have TRUE values for V1 and then TRUE values for V1 and V2.  Looks like a simple candidate for a DO loop.

 

Let's make up some data and try it.

data have ;
  input v1-v4 ;
cards;
1 2 3 4
1 0 6 7
. 8 9 10
1 2 0 5
6 7 8 9
;

data want;
  set have end=eof;
  array vars v1-v4;
  array nonmiss[4] ;
  do index=1 to dim(vars) while (vars[index]);
    nonmiss[index]+1;
  end;
  if eof;
  keep nonmiss: ;
run;

Results:

OBS    nonmiss1    nonmiss2    nonmiss3    nonmiss4

 1         4           3           2           2

View solution in original post

5 REPLIES 5
PaigeMiller
Diamond | Level 26

If you have to use PROC FREQ, then I agree that you will need a macro for this. However, PROC FREQ isn't needed, you can do most of this in a data step with ARRAYs. If you just add the values of var1+var2, the value will be missing if either var1 or var2 will is missing. So this gives you the consecutive ability to count missing across all columns. This code is UNTESTED. You will likely get a lot of notes or warnings in the log about missing values.

 

data temporary;
    set have;
    array v var1-var50;
    do i=2 to dim(v);
        do j=1 to i-1;
             v(i)=v(i)+v(j);
         end;
    end;
    drop i j;
run;

proc summary data=temporary;
     var v1-v50;
     output out=missings nmiss=/autoname;
run;

 

 

If you need to count zero as missing, you would add this as the first line in the DO J loop

 

if v(j)=0 then call missing(v(j));

 

--
Paige Miller
s_lassen
Meteorite | Level 14

You can get the result you want something like this;

data want;                                                                                                                              
  array counts(*) count1-count50;                                                                                                        
  retain count1-count50 0;                                                                                                               
  set have end=done;                                                                                                                    
  array vars(*) var1-var50;                                                                                                              
  do i=1 to dim(vars) while(vars(i) not in(.,0));                                                                                       
    counts(i)+1;                                                                                                                        
    end;                                                                                                                                
  if done;                                                                                                                              
  total=_N_;                                                                                                                            
  keep total count1-count50;                                                                                                             
run;

There will be just one row in the output. The variable TOTAL will contain the total number of observations, the variables COUNT1-COUNT50 will contain the number of observations with no missing values for 1,2... up to all 50 variables. If you want the number of observations with missing values as well, subtract the counts from the total.

Tom
Super User Tom
Super User

First point is that your test is not whether the values is missing or not.  Instead you are testing if the value is TRUE or not.  SAS will treat missing or zero as FALSE and any other value as TRUE.

 

So you want to count the number of observations that have TRUE values for V1 and then TRUE values for V1 and V2.  Looks like a simple candidate for a DO loop.

 

Let's make up some data and try it.

data have ;
  input v1-v4 ;
cards;
1 2 3 4
1 0 6 7
. 8 9 10
1 2 0 5
6 7 8 9
;

data want;
  set have end=eof;
  array vars v1-v4;
  array nonmiss[4] ;
  do index=1 to dim(vars) while (vars[index]);
    nonmiss[index]+1;
  end;
  if eof;
  keep nonmiss: ;
run;

Results:

OBS    nonmiss1    nonmiss2    nonmiss3    nonmiss4

 1         4           3           2           2
Satori
Quartz | Level 8

How can I do this if I want to mix both numeric and character variables?

Reeza
Super User

FYI - this post may have some interesting visualization and analysis options for you to explore your patterns of missing data.

https://blogs.sas.com/content/iml/2017/11/29/visualize-patterns-missing-values.html

 

Spoiler

@Satori wrote:

I have a list of 50 variables.

I did a proc freq to do counts of missing/not missing values for each variable individually.

I want to do counts of missing/not missing values for multiple variables simultaneously.

I think I need to use a macro.

 

What I want is something like this:

 

proc format; value missfmt .,0 ='Missing' other='Not Missing';
proc freq data=have; format var1 missfmt.; tables var1 / missing missprint nocum;

 

then I want to add var2 and do a proc freq but the not missing would be that both var1 and var2 are not missing while missing would be that either var1 or var2 is missing.

 

then I want to add var3, and so on until var50.

 

My idea is to see how the number of observations decreases as one extra variable is added.

 

Any suggestions on how to do this? 


SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 859 views
  • 2 likes
  • 5 in conversation