BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Mushy
Quartz | Level 8

Hello,

 

In proc compare, is there a way to copy ONLY the variables names with difference in values between two tables.

 

Thanks,

Mushy

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

It is more complicated, but it's doable:

data cl1;
set sashelp.class;
run;

data cl2;
set sashelp.class;
if name = "John" then age = 13;
if name = "Jane" then sex = "M";
run;

proc compare
  data=cl1
  compare=cl2
  noprint
  out=comp
  outnoequal
  outstats=stat (where=(_type_ = "NDIF" and (_base_ ne 0)))
;
run;

proc transpose
  data=comp (drop=_type_)
  out=long1 (where=(_name_ ne "_OBS_" and not missing(col1)))
;
by _obs_;
var _numeric_;
run;

proc transpose
  data=comp (drop=_type_)
  out=long2 (where=(_name_ ne "_OBS_" and compress(col1,".") > " "))
;
by _obs_;
var _character_;
run;

proc sql;
create table want as
  select distinct _name_
  from long1
  union
  select distinct _name_
  from long2
;
quit;

I needed two transposes so that the exclusion conditions don't interfere with the other data type.

View solution in original post

12 REPLIES 12
sbxkoenk
SAS Super FREQ

Hello,

 

That question is not clear to me.

What do you mean with "COPY"?

 

This is how I use PROC COMPARE :

/* example usage of PROC COMPARE */
proc compare base=emp95_byidnum compare=emp96_byidnum
             out=result outnoequal outbase outcomp outdif
             outstats=diffstat noprint;
   id idnum;
run;

Thanks,

Koen

Kurt_Bremser
Super User

This will write only the variables with differences to the OUTSTATS= dataset:

data cl1;
set sashelp.class;
run;

data cl2;
set sashelp.class;
if name = "John" then age = 13;
run;

proc compare
  data=cl1
  compare=cl2
  noprint
  outstats=stat (where=(_type_ = "NDIF" and (_base_ ne 0)))
;
run;
SASJedi
SAS Super FREQ

Oh, I LIKE that 🙂 Very nice.

Check out my Jedi SAS Tricks for SAS Users
Kurt_Bremser
Super User

TBH, this is one of the questions that I deal with in Maxim 13 (see the Talmud quote). The OP's question made me play around with PROC COMPARE to find out how variables with differences show up in the output datasets.

My first try was at using the OUT= dataset, TRANSPOSE the numerics with non-missing values, and sort that with NODUPKEY, but then I thought of the statistics, and - lo and behold! - there's NDIF. Then it was just a simple WHERE=.

Quentin
Super User

This is nice.  If there's any spare development time at SAS for PROC COMPARE, would love to see it made more ODS compliant, i.e. ability to use the ODS OUTPUT statement to get usable datasets out of PROC COMPARE.

The Boston Area SAS Users Group (BASUG) is hosting our in person SAS Blowout on Oct 18!
This full-day event in Cambridge, Mass features four presenters from SAS, presenting on a range of SAS 9 programming topics. Pre-registration by Oct 15 is required.
Full details and registration info at https://www.basug.org/events.
Mushy
Quartz | Level 8

Hello Kurt,

 

Why does the "stat" table does not contain all variables in sashelp.class? It displays stat only for age,height and weight?

 

proc compare
data=cl1
compare=cl2
noprint
outstats=stat
;
run;

 

Thanks,

Mushy

ballardw
Super User

@Mushy wrote:

Hello Kurt,

 

Why does the "stat" table does not contain all variables in sashelp.class? It displays stat only for age,height and weight?

 

proc compare
data=cl1
compare=cl2
noprint
outstats=stat
;
run;

 

Thanks,

Mushy


When you look at the output data set a key indicator the _type_ column. Note what it contains: N, Mean, Max, Min, STD and a few other values. These statistics other than N would be meaningless for character variables. What would the "mean" of the Name variable look like? Or the standard deviation? These stats are only calculated for numeric values to have meaning.

Kurt_Bremser
Super User

@Mushy wrote:

Hello Kurt,

 

Why does the "stat" table does not contain all variables in sashelp.class? It displays stat only for age,height and weight?

 

proc compare
data=cl1
compare=cl2
noprint
outstats=stat
;
run;

 

Thanks,

Mushy


That is because the STAT= output datasets contains statistics that can only be applied to numeric values, so character variables do not appear there. If you run the compare without NOPRINT, you will see that the NAME and SEX variables are also compared:

Anzahl der verglichenen Variablen, bei denen alle Werte gleich sind: 4.                                 
Anzahl der verglichenen Variablen, bei denen einige Werte ungleich sind: 1.                             

(German output because of my locale, but you can see all 5 variables have been processed)

 

If you need to find character variables also, then you need to go the more complicated way by using an OUT= dataset and processing that.

Mushy
Quartz | Level 8

Hello Kurt,

 

Do you already have a code/method to get to know all the variables with differences with out= irrespective of the type? I looked into it and seems to be a bit complicated

 

 

Kurt_Bremser
Super User

It is more complicated, but it's doable:

data cl1;
set sashelp.class;
run;

data cl2;
set sashelp.class;
if name = "John" then age = 13;
if name = "Jane" then sex = "M";
run;

proc compare
  data=cl1
  compare=cl2
  noprint
  out=comp
  outnoequal
  outstats=stat (where=(_type_ = "NDIF" and (_base_ ne 0)))
;
run;

proc transpose
  data=comp (drop=_type_)
  out=long1 (where=(_name_ ne "_OBS_" and not missing(col1)))
;
by _obs_;
var _numeric_;
run;

proc transpose
  data=comp (drop=_type_)
  out=long2 (where=(_name_ ne "_OBS_" and compress(col1,".") > " "))
;
by _obs_;
var _character_;
run;

proc sql;
create table want as
  select distinct _name_
  from long1
  union
  select distinct _name_
  from long2
;
quit;

I needed two transposes so that the exclusion conditions don't interfere with the other data type.

Mushy
Quartz | Level 8

Hi Kurt, Super Thanks!

Quentin
Super User

I'm still grumpy. : )

 

If PROC COMPARE was ODS compliant, it should be as easy as:

 

ods trace on ;
ods output comparesummary=want ;
proc compare
  data=cl1
  compare=cl2
;
run;
ods trace off ;

This little bit of the output you want clearly could be an ODS object / table:

Variable  Type  Len  Ndif   MaxDif

Sex       CHAR    1     1
Age       NUM     8     1    1.000

But instead, currently PROC COMPARE provides ODS objects that are basically just fixed-length text output you would have to parse yourself:

 

proc print data=want ;
run ;
Obs    type                                 batch

  1     d
  2     d
  3     h                            Observation Summary
  4     h
  5     h                       Observation      Base  Compare
  6     d
  7     d                       First Obs           1        1
  8     d                       First Unequal       7        7
  9     d                       Last  Unequal      10       10
 10     d                       Last  Obs          19       19
 11     d
 12     d      Number of Observations in Common: 19.
 13     d      Total Number of Observations Read from WORK.CL1: 19.
 14     d      Total Number of Observations Read from WORK.CL2: 19.
 15     d
 16     d      Number of Observations with Some Compared Variables Unequal: 2.
 17     d      Number of Observations with All Compared Variables Equal: 17.
 18     d
 19     d
 20     h                         Values Comparison Summary
 21     h
 22     d      Number of Variables Compared with All Observations Equal: 3.
 23     d      Number of Variables Compared with Some Observations Unequal: 2.
 24     d      Total Number of Values which Compare Unequal: 2.
 25     d      Maximum Difference: 1.
 26     d
 27     d
 28     h                       Variables with Unequal Values
 29     h
 30     h                    Variable  Type  Len  Ndif   MaxDif
 31     d
 32     d                    Sex       CHAR    1     1
 33     d                    Age       NUM     8     1    1.000
 34     d

 

The Boston Area SAS Users Group (BASUG) is hosting our in person SAS Blowout on Oct 18!
This full-day event in Cambridge, Mass features four presenters from SAS, presenting on a range of SAS 9 programming topics. Pre-registration by Oct 15 is required.
Full details and registration info at https://www.basug.org/events.

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 2000 views
  • 15 likes
  • 6 in conversation