DATA Step, Macro, Functions and more

Code to find missing values in all variables of an observation

Reply
Contributor msg
Contributor
Posts: 21

Code to find missing values in all variables of an observation

Hi All,

I have got a requirement like this.

I have to delete all observations which has missing values in all variables; irrespective of the datatype. This should be accomplished in a data step in a single statement, without using macro.

I tried with _all_. But, i guess The SAS System never allows this to be assigned to a variable. I also tried with cmiss & nmiss, but these 2 functions are specific to datatype. I tried with sum function which will throw a "invalid numeric data" note when character variable is encountered.

Help me if you know any logic.

Thanks
MSG
Super Contributor
Super Contributor
Posts: 3,174

Re: Code to find missing values in all variables of an observation

Yes, you can declare an ARRAY using the _ALL_ operand for the variable name and then test using the MISSING function in a DO / END loop. If all variables have a missing condition, then DELETE the observation.

Scott Barry
SBBWorks, Inc.
Contributor msg
Contributor
Posts: 21

Re: Code to find missing values in all variables of an observation

Thanks SBB for your response.

Yes, i tried implementing with an array. But, our array will accept only variables of the same datatype. So, i will not be able to define an array for this situation.

Thanks
MSG
Valued Guide
Posts: 2,177

Re: Code to find missing values in all variables of an observation

no array required.
Rather than re-write all your table, you can update-in-place with the modify statement and remove rows where the count of non-missing numerics is zero and the concatenation of all character columns is blank with:
data your.data ;
modify your.data ;
if n( of _numeric_ ) = 0 and cats( of _character_ ) = ' ' then remove ;
run ;
Does this do what is needed?
The data step creating a new table with the missing obs deleted is very similar
(a new dataset name in the data statement, replace "modify" with "set", and "remove" with "delete")
PeterC
Contributor msg
Contributor
Posts: 21

Re: Code to find missing values in all variables of an observation

Hi Peter,

Yes, that was one way.

But the function cats didnt accept it's arguments in this way. The arguement "of _character_" resulted in an error asking for enough arguments.

Thanks
MSG
Valued Guide
Posts: 2,177

Re: Code to find missing values in all variables of an observation

that systax error occurs when there are no character variables, and you indicated that _all_ involved more than one data type implying that there are character variables in that test case.
The solution only requires a small extension to that use of cats()
and cats( of _character_, ' ' )
Perhaps there needs to be a similar alteration to the use of the N() function
n( of _numeric_, . )
Contributor msg
Contributor
Posts: 21

Re: Code to find missing values in all variables of an observation

Thanks Peter & SBB for your responses.
Super Contributor
Posts: 359

Re: Code to find missing values in all variables of an observation

Cat will take numerics so use:
if compress(cat(of _all_, ". ")) = '' then delete;
Respected Advisor
Posts: 3,799

Re: Code to find missing values in all variables of an observation

Yes CAT does accept numerics but the technique won't work if the numerics have special missing values. Better to stick with Peter's suggestion.

OBS 4 below has all missing variable as does OBS 3.

[pre]
data test;
missing a b;
input a$ b c$ d e f;
*if compress(cats(of _all_), ". ") eq ' ' then flag=1;
if n(of _numeric_,.) eq 0 and cats(of _character_,' ') eq ' ' then flag=1;
cards;
a b c 1 2 3
2 a c 1 2 3
. . . . . .
. a . a b b
;;;;
run;
proc print;
run;
[/pre]
Super Contributor
Super Contributor
Posts: 3,174

Re: Code to find missing values in all variables of an observation

Posted in reply to data_null__
Sorry about the incorrect ARRAY info - yes, it is sensitive to SAS variable-type, so two arrays are necessary, as illustrated below:

data _null_;
retain a1-a5 ' ' n1-n5 0;
array allnvars (*) _numeric_;
array allcvars (*) _character_;
do i=1 to dim(allnvars);
putlog allnvars(i)= ;
end;
do i=1 to dim(allcvars);
putlog allcvars(i)= ;
end;
stop;
run;

Also, I find the other alternatives to be good choices as well.

Scott Barry
SBBWorks, Inc.
Ask a Question
Discussion stats
  • 9 replies
  • 394 views
  • 0 likes
  • 5 in conversation