12-11-2014 11:16 AM
I've two datasets which are extracted from a master dataset of 295 variables. one among them is having missing values and the other not.
Now, I'd like to know all the variables list which are missing in base dataset but not missing in compared dataset. Basically I'm analyzing the data to know why few attributes are missing to understand our product activities.
Please help me with the query.
I've used compare procedure but not of great help. I need a list of variables that are different from other dataset.
12-11-2014 12:13 PM
There a several ways to interpret this request.
If you could provide a few lines of data that demonstrate what you have and what you would expect the outcome to be it would be helpful.
And what was not helpful about Proc Compare? You may only need additional options.
12-11-2014 12:43 PM
Agree, PROC COMPARE is probably helpful, e.g:
data class; set sashelp.class; if _n_=3 then do; name=''; height=.; end; run; ods output CompareSummary=summary; proc compare base=class compare=sashelp.class; run; ods output close; proc print data=summary; run;
The bummer is that PROC COMPARE has poor support for ODS output tables. So while the output shows what I think you want:
Variables with Unequal Values Variable Type Len Ndif MaxDif MissDif Name CHAR 8 1 1 Height NUM 8 1 0 1
The dataset from ods output is pretty ugly, just a character variable with a bunch of text in it:
Obs type batch 1 d 2 d 3 h Observation Summary 4 h 5 h Observation Base Compare 6 d 7 d First Obs 1 1 8 d First Unequal 3 3 9 d Last Unequal 3 3 10 d Last Obs 19 19 11 d 12 d Number of Observations in Common: 19. 13 d Total Number of Observations Read from WORK.CLASS: 19. 14 d Total Number of Observations Read from SASHELP.CLASS: 19. 15 d 16 d Number of Observations with Some Compared Variables Unequal: 1. 17 d Number of Observations with All Compared Variables Equal: 18. 18 d 19 d 20 h Values Comparison Summary 21 h 22 d Number of Variables Compared with All Observations Equal: 3. 23 d Number of Variables Compared with Some Observations Unequal: 2. 24 d Number of Variables with Missing Value Differences: 2. 25 d Total Number of Values which Compare Unequal: 2. 26 d Maximum Difference: 0. 27 d 28 h The COMPARE Procedure 29 h Comparison of WORK.CLASS with SASHELP.CLASS 30 h (Method=EXACT) 31 h 32 h Variables with Unequal Values 33 h 34 h Variable Type Len Ndif MaxDif MissDif 35 d 36 d Name CHAR 8 1 1 37 d Height NUM 8 1 0 1 38 d
12-11-2014 01:49 PM
Would this help?
if _n_=3 then do;
data matches nonmatches;
if _n_=1 then do;
declare hash compare(dataset: 'sashelp.class');
compare.definekey('name', 'height', 'weight'); /*define the key variables for comparison*/
/* compare.definedata(); no need*/
if compare.find()=0 then output matches;
else output nonmatches;