rmartins
Calcite | Level 5

I'm using PROC COMPARE to compare two datasets with a common record ID. I want to create a new dataset that lists all variables and contains a count of the number of records where the value is not equal.

Basically the output will look like this, with a record for every variable (or at least every variable that is different in the two recordsets):

Variable    Records Not Equal
Field 1     10
Field 2     15

The PROC COMPARE step below comes close, but the dataset created by OUTSTATS= has only 68 records (the original datasets contain several hundred variables).

proc compare
  base=old
  compare=new
  outstats=differences (where=(_type_ = 'NDIF'))
  transpose;
  id record_id;
run;

After running this, I see almost exactly what I want in the report in the results viewer but I can't figure out how to get this into a dataset.

Thanks for your help,

Rob

4 REPLIES
Jagadishkatam
Amethyst | Level 16

Did you try using the OUT= option (out=datasetname) in PROC COMPARE?

Thanks,
Jag
rmartins
Calcite | Level 5

Thanks, Jagadishkatam. out=datasetname creates a dataset with one record for every record that differs between the two datasets. What I want is a summary of the differences grouped by variable. outstats= comes close, but for some reason it outputs only some of the variables.
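One possible explanation, worth checking rather than taking as given: as I read the PROC COMPARE documentation, OUTSTATS= holds summary statistics only for pairs of numeric variables, so character variables would never appear in it. If that is the cause, the number of numeric variables should match the 68 rows. A minimal sketch for checking this, assuming the base dataset is WORK.OLD as in the step above:

```sas
/* Count numeric vs. character variables in the base dataset.        */
/* Assumption: the dataset is WORK.OLD; DICTIONARY.COLUMNS stores    */
/* library and member names in upper case.                           */
proc sql;
    select type,
           count(*) as n_vars
    from dictionary.columns
    where libname = 'WORK'
      and memname = 'OLD'
    group by type;
quit;
```

If the count for type 'num' comes back as 68, the missing rows are most likely the character variables, and OUTSTATS= alone will not give a count for them.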

Haikuo
Onyx | Level 15

I don't think there is a straightforward solution, but ODS can get you there with some additional work, as in the sample below.

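/* Create two identical copies of SASHELP.CLASS */
data h1 h2;
     set sashelp.class;
run;

/* Randomly change some values in h2 so the two copies differ */
data h2;
     set h2;
     if ranuni(1) > 0.5 then
           age = age + 1;
     if ranuni(2) > 0.5 then
           weight = weight + 2;
     if ranuni(3) > 0.5 then
           height = height + 3;
run;

/* Capture the printed comparison summary as a dataset */
ods output CompareSummary=want1;

proc compare
     base=h1
     compare=h2;
     id name;
run;

ods output close;

/* Each row of want1 holds one line of report text in the variable batch. */
/* firstobs=35 skips to the per-variable table for this sample; adjust it */
/* for your own data.                                                     */
data want;
     set want1 (firstobs=35);
     if not missing(batch);
     name_var = scan(batch, 1);   /* variable name                         */
     n_diff   = scan(batch, 4);   /* number of differences, stored as text */
     keep name_var n_diff;
run;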

You may need to play with some parameters (firstobs=, etc) to make it work for you.
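To avoid hand-tuning firstobs=, one option is to look for the table heading in the captured text instead of relying on a fixed row number. The sketch below is untested against your data and rests on assumptions: that the heading reads "Variables with Unequal Values", that data rows carry NUM or CHAR as their second token, and that the count of differences is the fourth token. The dataset name want_by_heading and the flag in_table are made up for illustration.

```sas
/* Sketch: find the per-variable table by its heading, not by row number. */
data want_by_heading;
     set want1;
     length name_var $32;
     retain in_table 0;

     /* Assumed heading text; compare it with your own listing */
     if index(batch, 'Variables with Unequal Values') > 0 then in_table = 1;

     /* Keep only data rows: second token is the variable type */
     if in_table and upcase(scan(batch, 2, ' ')) in ('NUM', 'CHAR');

     name_var = scan(batch, 1, ' ');
     /* ?? suppresses notes if the token is not a valid number */
     n_diff   = input(scan(batch, 4, ' '), ?? best12.);

     keep name_var n_diff;
run;
```

If the listing includes a label column, or variable names contain blanks, the token positions will shift and the scan() arguments need adjusting.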

Good Luck,

Haikuo

rmartins
Calcite | Level 5

Thanks Hai.Kuo. That's clever and I'll play around with it a bit. outstats= is so close to what I want I was hoping there would be a way to tweak that to make it work, but maybe ODS is the answer.
