DATA Step, Macro, Functions and more

Missing values in a Data Set compared to another dataset

Reply
Contributor
Posts: 40

Missing values in a Data Set compared to another dataset

Hi,

I've two datasets which are extracted from a master dataset of 295 variables. one among them is having missing values and the other not.

Now, I'd like to know all the variables list which are missing in base dataset but not missing in compared dataset. Basically I'm analyzing the data to know why few attributes are missing to understand our product activities.

Please help  me with the query.

I've used compare procedure but not of great help. I need a list of variables that are different from other dataset.

Thanks,

Naresh

Super User
Posts: 11,338

Re: Missing values in a Data Set compared to another dataset

Posted in reply to NareshAbburi

There a several ways to interpret this request.

If you could provide a few lines of data that demonstrate what you have and what you would expect the outcome to be it would be helpful.

And what was not helpful about Proc Compare? You may only need additional options.

PROC Star
Posts: 1,322

Re: Missing values in a Data Set compared to another dataset

Posted in reply to NareshAbburi

Agree, PROC COMPARE is probably helpful, e.g:

data class;
  set sashelp.class;
  if _n_=3 then do;
    name='';
    height=.;
  end;
run;

ods output CompareSummary=summary;
proc compare base=class compare=sashelp.class;
run;
ods output close;

proc print data=summary;
run;

The bummer is that PROC COMPARE has poor support for ODS output tables.  So while the output shows what I think you want:

        Variables with Unequal Values

Variable  Type  Len  Ndif   MaxDif  MissDif

Name      CHAR    8     1                 1
Height    NUM     8     1        0        1

The dataset from ods output is pretty ugly, just a character variable with a bunch of text in it:

          Obs    type                                 batch

            1     d
            2     d
            3     h                            Observation Summary
            4     h
            5     h                       Observation      Base  Compare
            6     d
            7     d                       First Obs           1        1
            8     d                       First Unequal       3        3
            9     d                       Last  Unequal       3        3
           10     d                       Last  Obs          19       19
           11     d
           12     d      Number of Observations in Common: 19.
           13     d      Total Number of Observations Read from WORK.CLASS: 19.
           14     d      Total Number of Observations Read from SASHELP.CLASS: 19.
           15     d
           16     d      Number of Observations with Some Compared Variables Unequal: 1.
           17     d      Number of Observations with All Compared Variables Equal: 18.
           18     d
           19     d
           20     h                         Values Comparison Summary
           21     h
           22     d      Number of Variables Compared with All Observations Equal: 3.
           23     d      Number of Variables Compared with Some Observations Unequal: 2.
           24     d      Number of Variables with Missing Value Differences: 2.
           25     d      Total Number of Values which Compare Unequal: 2.
           26     d      Maximum Difference: 0.
           27     d
           28     h                           The COMPARE Procedure
           29     h                Comparison of WORK.CLASS with SASHELP.CLASS
           30     h                               (Method=EXACT)
           31     h
           32     h                       Variables with Unequal Values
           33     h
           34     h               Variable  Type  Len  Ndif   MaxDif  MissDif
           35     d
           36     d               Name      CHAR    8     1                 1
           37     d               Height    NUM     8     1        0        1
           38     d


--Q.

Frequent Contributor
Posts: 115

Re: Missing values in a Data Set compared to another dataset

Would this help?

data class;

  set sashelp.class;

  if _n_=3 then do;

    name='';

    height=.;

  end;

run;

data matches nonmatches;

     if _n_=1 then do;

          declare hash compare(dataset: 'sashelp.class');

          compare.definekey('name', 'height', 'weight'); /*define the key variables for comparison*/

         /* compare.definedata(); no need*/

         compare.definedone();

     end;

     set class;

     if compare.find()=0 then output matches;

     else output nonmatches;

run;

Naveen Srinivasan

L&T Infotech

Super User
Posts: 10,018

Re: Missing values in a Data Set compared to another dataset

Posted in reply to NareshAbburi

Post some sample data and the output you need .

Ask a Question
Discussion stats
  • 4 replies
  • 332 views
  • 7 likes
  • 5 in conversation