Help using Base SAS procedures

how to find an "empty" observation?

Reply
New Contributor
Posts: 2

how to find an "empty" observation?


I have a dataset with about 6000 observations and over 1000 variables, some of them may have a value only for one variable - ID, how can I find these observations without listing ALL 1000 variables as missing in my code? something general?

Hope someone knows!!

Valued Guide
Posts: 765

Re: how to find an "empty" observation?

hi ... try CMISS (idea posted a few times by Linlin) ... if ID is never missing you can take that check out of the IF statement  ...

data x;

input id gender :$1. a b c;

datalines;

1 M 9 9 9

2 . . . .

3 F 9 . 9

. F . . .

;

* specify number of variables-1 for CMISS finction;

data y;

set x;

if ^missing(id) and cmiss(of _all_) eq  4;

run;


id    gender    a    b    c

2              .    .    .

Trusted Advisor
Posts: 1,018

Re: how to find an "empty" observation?

Try this (based on a recent SAS-L discussion):

data want;

  set have (drop=id);

  _n_=  ( n(of _numeric_)>0)  + (coalescec(of _character_)^=' ');

  set have (keep=id);

  if _n_;

run;

The revealing part of this (to me) was to learn that ID was not included when the "of _numeric_" and "of _character_" were compiled, so it did not interfere with the testing for all missing values.

Super User
Posts: 10,020

Re: how to find an "empty" observation?

one more way.

data x;
input id gender :$1. a b c;
datalines;
1 M 9 9 9
2 . . . .
3 F 9 . 9
. F . . .
;
run;


options missing=' ';
data want;
 set x;
 if cats(of _all_)=cats(id) ;
run;




Ksharp

PROC Star
Posts: 7,468

Re: how to find an "empty" observation?

And, you can take Ksharp's, or any of the suggestions, and incorporate the logic into code that separates the missing and non-missing records.  e.g.:

data have;

input id gender :$1. a b c;

datalines;

1 M 9 9 9

2 . . . .

3 F 9 . 9

. . . . .

. F . . .

;

run;

options missing=' ';

data want missing;

set have;

if missing(cats(of _all_))

  or cats(of _all_)=cats(id)

  then output missing;

else output want;

run;

Trusted Advisor
Posts: 1,018

Re: how to find an "empty" observation?

" Yes, using the "options missing=' '  in combination with a single "if missing(cats(of _all_))" is seductive,
   BUT ...

it only works with plain vanilla numeric missing values.  As a counter-example, consider that:

options missing=' ';

data _null_;

  x=.A;

  if missing(cats(of _all_)) then putlog 'missing';

  else putlog 'Not missing';

run;


will produce "Not missing" on the log.

As long as you have any numeric values I don't see how to efficiently avoid separately testing numerics and character variables for 100% missing-ness.

So I'll go back to my earlier suggestion with a minor modification to adapt to the possibility that the incoming data set night have only one type of variable.  Lets assume there are 3 ID variables that should be excluded from the testing.  Then:

data want (drop=_Smiley Happy;

   retain _chr ' '  _num .;

   set have;  

   call missing(id1, id2, id3);  ** Avoid false positives in the test for all missing **;

   _allmiss=   (N(of _numeric_)=0)) and (coalescec(of _character_)=' ');

    set have (keep=id1 id2 id3);   ** Go back and get the excluded variables **;

    if _allmiss;

run;

Changes:

  1. The leading RETAIN statement assures that there is at least one variables of each type, so that the N and COALESCEC functions don't produce a "no arguments" error condition.
  2. The first SET statement doesn't have the "drop=id1 id2 id3") option that I used earlier, because I don't want to change the order of variables in WANT.  Hence a "call missing" statement must precede the test for all missing numerics and characters.
Super User
Posts: 10,020

Re: how to find an "empty" observation?

Ha. It looks like we need some more code to be stronger .

data have;
input id gender :$1. a b c;
datalines;
1 M 9 9 9
2 . . .b ._
3 F 9 . 9
. . .a . .z
. F . . .
;
run;   
options missing=' ';
data want missing;
 set have;
 array _a{*} _numeric_;
 do _n_=1 to dim(_a) ;
  if missing(_a{_n_}) then call missing(_a{_n_});
 end;

if missing(cats(of _all_))
  or cats(of _all_)=cats(id)
  then output missing;
else output want;
run;
run;

Ksharp

Trusted Advisor
Posts: 1,018

Re: how to find an "empty" observation?

I don't see the benefit of this approach.  It confirms the assertion that numeric values have to be treated separately from character values, and it unneccesarily loses information by collapsing all numeric missing codes to a single value.

Contributor
Posts: 63

Re: how to find an "empty" observation?

What if i have 2 variables present and rest all missing and need to delete those type of observations.

For example gender and class should be excluded and need to check the rest all missing observations and delete those which have all missing values.

PROC Star
Posts: 7,468

Re: how to find an "empty" observation?

Posted in reply to venkatard

Pick the solution you want and just include a drop option.  e.g.:

data have;

input id gender :$1. class a b c;

datalines;

1 M 1 9 9 9

2 . 1 . . .

3 F 2 9 . 9

. . 3 . . .

. F 3 . . .

;

run;

options missing=' ';

data want missing;

set have (drop=gender class);

if missing(cats(of _all_))

  or cats(of _all_)=cats(id)

  then output missing;

else output want;

run;

New Contributor
Posts: 2

Re: how to find an "empty" observation?

Thanks to all of you for help.

Works nicely.

Ask a Question
Discussion stats
  • 10 replies
  • 1387 views
  • 0 likes
  • 6 in conversation