rmartins
Calcite | Level 5

I'm using PROC COMPARE to compare two datasets with a common record ID. I want to create a new dataset that lists all variables and contains a count of the number of records where the value is not equal.

Basically the output will look like this, with a record for every variable (or at least every variable that is different in the two recordsets):

Variable    Records Not Equal
Field 1     10
Field 2     15

The PROC COMPARE step below comes close, but the dataset created by OUTSTATS= has only 68 records, while the original datasets have several hundred variables.

proc compare
  base=old
  compare=new
  outstats=differences (where=(_type_ = 'NDIF'))
  transpose;
  id record_id;
run;

After running this, I see almost exactly what I want in the report in the results viewer but I can't figure out how to get this into a dataset.

Thanks for your help,

Rob

Jagadishkatam
Amethyst | Level 16

Did you try using the OUT= option (out=datasetname) in PROC COMPARE?

Thanks,
Jag
rmartins
Calcite | Level 5

Thanks, Jagadishkatam. out=datasetname creates a dataset that contains a record for every record that differs between the datasets. What I want is a summary of the differences grouped by variable. outstats= comes close, but for some reason it only outputs some of the variables.

Haikuo
Onyx | Level 15

I don't think there is a straightforward solution, but ODS OUTPUT can get you there with some additional work, as in the sample below.

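/* Build two identical copies of a sample dataset */
data h1 h2;
     set sashelp.class;
run;

/* Randomly change some values in the second copy */
data h2;
     set h2;
     if ranuni(1)>0.5 then
           age=age+1;
     if ranuni(2) >0.5 then
           weight=weight+2;
     if ranuni(3) >0.5 then
           height=height+3;
run;

/* Capture the summary report of PROC COMPARE as a dataset */
ods output CompareSummary=want1;

proc compare
     base=h1
     compare=h2;
     id name;
run;

ods output close;

/* Each report line is stored as text in BATCH. In this example the      */
/* per-variable listing starts at row 35; token 1 is the variable name   */
/* and token 4 is the count of unequal values (kept as character).       */
data want;
     set want1 (firstobs=35);
     if not missing(batch);
     name_var=scan(batch,1);
     n_diff=scan(batch,4);
     keep name_var n_diff;
run;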

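You may need to adjust some parameters (firstobs=, the SCAN positions, etc.) to make it work for your data, since the row where the per-variable listing starts depends on the layout of the report.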
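
One way to avoid the hard-coded firstobs=35 is to look for the section title instead of a fixed row number. The step below is only a minimal sketch, not a tested solution. It assumes that the per-variable listing in want1 is introduced by a line containing "Variables with Unequal Values" and that Ndif is the fourth blank-delimited token on each row, as in the sample above. If the variables carry labels, the listing may gain a Label column and the token position would need adjusting. The names in_section and want_ndif are illustrative.

```sas
data want_ndif;
     set want1;
     length name_var $32;
     retain in_section 0;

     /* Assumption: this title line precedes the per-variable listing */
     if index(batch, 'Variables with Unequal Values') then in_section = 1;

     /* Skip everything before the listing, plus blank separator lines */
     if in_section and not missing(batch);

     name_var = scan(batch, 1);

     /* ?? suppresses log notes for non-numeric tokens such as "Ndif" */
     n_diff = input(scan(batch, 4), ?? 12.);

     /* Title and header rows give a missing n_diff and are dropped here */
     if not missing(n_diff);

     keep name_var n_diff;
run;
```

Unlike the original step, this version stores n_diff as a numeric variable, which makes it easier to sort or filter on the count afterwards.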

Good Luck,

Haikuo

rmartins
Calcite | Level 5

Thanks Hai.Kuo. That's clever and I'll play around with it a bit. outstats= is so close to what I want that I was hoping there would be a way to tweak it to make it work, but maybe ODS is the answer.
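
A possible explanation for the 68 records, offered as an assumption rather than something confirmed in this thread: the PROC COMPARE documentation describes the OUTSTATS= dataset as holding summary statistics for pairs of numeric variables, so character variables would not appear in it. This is worth verifying against the documentation for your SAS release. A quick check is to count the variables of each type in the base dataset. The sketch below assumes that old lives in the WORK library.

```sas
proc sql;
     /* DICTIONARY.COLUMNS stores library and member names in uppercase */
     select type, count(*) as n_vars
          from dictionary.columns
          where libname = 'WORK' and memname = 'OLD'
          group by type;
quit;
```

If the count for type num is close to 68 (the ID variable is not itself compared), the variables missing from OUTSTATS= are probably the character ones, and the ODS OUTPUT approach above would be the way to capture them.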
