08-15-2014 12:57 PM
I'm using PROC COMPARE to compare two datasets with a common record ID. I want to create a new dataset that lists all variables and contains a count of the number of records where the value is not equal.
Basically the output will look like this, with a record for every variable (or at least every variable that is different in the two recordsets):
Variable | Records Not Equal
The PROC COMPARE step below comes close, but the dataset created by OUTSTATS= has only 68 records, even though there are several hundred variables in the original dataset.
outstats=differences (where=(_type_ = 'NDIF'))
After running this, I see almost exactly what I want in the report in the Results Viewer, but I can't figure out how to get that report into a dataset.
Thanks for your help,
08-18-2014 12:23 PM
Thanks, Jagadishkatam. OUT=datasetname creates a dataset with one record for every record that differs between the two datasets. What I want is a summary of the differences grouped by variable. OUTSTATS= comes close, but for some reason it outputs only some of the variables.
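For what it's worth, the per-record OUT= dataset can be rolled up into a per-variable count. The following is only a rough sketch, not a tested solution: base_ds, comp_ds, record_id, diffs, long and want are placeholder names, both inputs are assumed to be sorted by record_id, and it relies on my understanding that OUTDIF rows mark equal numeric values with the special missing value .E and unequal character positions with an X. As far as I know, OUTSTATS= only reports numeric variables, which may be why it returns fewer rows than expected.

```sas
/* Placeholder names: base_ds, comp_ds, record_id are assumptions. */
proc compare base=base_ds compare=comp_ds
             out=diffs outdif outnoequal noprint;
   id record_id;
run;

/* Write one row per (record, variable) pair that differs. */
data long (keep=variable);
   set diffs (drop=_type_ _obs_ record_id);
   array nums  {*} _numeric_;
   array chars {*} _character_;
   length variable $32;   /* declared after the arrays so it is not an element of chars */
   do i = 1 to dim(nums);
      /* Assumption: equal numeric values are stored as .E in DIF rows. */
      if nums{i} ne .E then do;
         variable = vname(nums{i});
         output;
      end;
   end;
   do i = 1 to dim(chars);
      /* Assumption: an X marks each character position that differs. */
      if index(chars{i}, 'X') > 0 then do;
         variable = vname(chars{i});
         output;
      end;
   end;
run;

/* Count the differing records for each variable. */
proc freq data=long noprint;
   tables variable / out=want (keep=variable count rename=(count=n_diff));
run;
```

Variables with no differences will not appear in want, and SAS will issue a warning about a zero-element array if the data has no numeric or no character variables to compare.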
08-15-2014 02:13 PM
I don't think there is a straightforward solution, but ODS OUTPUT can capture the report as a dataset. It needs some additional work, as in the sample below.
data h1 h2;
if ranuni(1)>0.5 then
if ranuni(2) >0.5 then
if ranuni(3) >0.5 then
ods output CompareSummary=want1;
ods output close;
set want1 (firstobs=35);
if not missing(batch);
keep name_var n_diff;
You may need to adjust some parameters (firstobs=, etc.) to make it work for your data.
08-18-2014 12:25 PM
Thanks Hai.Kuo. That's clever and I'll play around with it a bit. outstats= is so close to what I want I was hoping there would be a way to tweak that to make it work, but maybe ODS is the answer.