Hello,
In proc compare, is there a way to copy ONLY the variables names with difference in values between two tables.
Thanks,
Mushy
It is more complicated, but it's doable:
data cl1;
set sashelp.class;
run;
data cl2;
set sashelp.class;
if name = "John" then age = 13;
if name = "Jane" then sex = "M";
run;
proc compare
data=cl1
compare=cl2
noprint
out=comp
outnoequal
outstats=stat (where=(_type_ = "NDIF" and (_base_ ne 0)))
;
run;
proc transpose
data=comp (drop=_type_)
out=long1 (where=(_name_ ne "_OBS_" and not missing(col1)))
;
by _obs_;
var _numeric_;
run;
proc transpose
data=comp (drop=_type_)
out=long2 (where=(_name_ ne "_OBS_" and compress(col1,".") > " "))
;
by _obs_;
var _character_;
run;
proc sql;
create table want as
select distinct _name_
from long1
union
select distinct _name_
from long2
;
quit;
I needed two transposes so that the exclusion conditions don't interfere with the other data type.
Hello,
That question is not clear to me.
What do you mean with "COPY"?
This is how I use PROC COMPARE :
/* example usage of PROC COMPARE */
proc compare base=emp95_byidnum compare=emp96_byidnum
out=result outnoequal outbase outcomp outdif
outstats=diffstat noprint;
id idnum;
run;
Thanks,
Koen
This will write only the variables with differences to the OUTSTATS= dataset:
data cl1;
set sashelp.class;
run;
data cl2;
set sashelp.class;
if name = "John" then age = 13;
run;
proc compare
data=cl1
compare=cl2
noprint
outstats=stat (where=(_type_ = "NDIF" and (_base_ ne 0)))
;
run;
Oh, I LIKE that 🙂 Very nice.
TBH, this is one of the questions that I deal with in Maxim 13 (see the Talmud quote). The OP's question made me play around with PROC COMPARE to find out how variables with differences show up in the output datasets.
My first try was at using the OUT= dataset, TRANSPOSE the numerics with non-missing values, and sort that with NODUPKEY, but then I thought of the statistics, and - lo and behold! - there's NDIF. Then it was just a simple WHERE=.
This is nice. If there's any spare development time at SAS for PROC COMPARE, would love to see it made more ODS compliant, i.e. ability to use the ODS OUTPUT statement to get usable datasets out of PROC COMPARE.
Hello Kurt,
Why does the "stat" table does not contain all variables in sashelp.class? It displays stat only for age,height and weight?
proc compare
data=cl1
compare=cl2
noprint
outstats=stat
;
run;
Thanks,
Mushy
@Mushy wrote:
Hello Kurt,
Why does the "stat" table does not contain all variables in sashelp.class? It displays stat only for age,height and weight?
proc compare
data=cl1
compare=cl2
noprint
outstats=stat
;
run;
Thanks,
Mushy
When you look at the output data set a key indicator the _type_ column. Note what it contains: N, Mean, Max, Min, STD and a few other values. These statistics other than N would be meaningless for character variables. What would the "mean" of the Name variable look like? Or the standard deviation? These stats are only calculated for numeric values to have meaning.
@Mushy wrote:
Hello Kurt,
Why does the "stat" table does not contain all variables in sashelp.class? It displays stat only for age,height and weight?
proc compare
data=cl1
compare=cl2
noprint
outstats=stat
;
run;
Thanks,
Mushy
That is because the STAT= output datasets contains statistics that can only be applied to numeric values, so character variables do not appear there. If you run the compare without NOPRINT, you will see that the NAME and SEX variables are also compared:
Anzahl der verglichenen Variablen, bei denen alle Werte gleich sind: 4. Anzahl der verglichenen Variablen, bei denen einige Werte ungleich sind: 1.
(German output because of my locale, but you can see all 5 variables have been processed)
If you need to find character variables also, then you need to go the more complicated way by using an OUT= dataset and processing that.
Hello Kurt,
Do you already have a code/method to get to know all the variables with differences with out= irrespective of the type? I looked into it and seems to be a bit complicated
It is more complicated, but it's doable:
data cl1;
set sashelp.class;
run;
data cl2;
set sashelp.class;
if name = "John" then age = 13;
if name = "Jane" then sex = "M";
run;
proc compare
data=cl1
compare=cl2
noprint
out=comp
outnoequal
outstats=stat (where=(_type_ = "NDIF" and (_base_ ne 0)))
;
run;
proc transpose
data=comp (drop=_type_)
out=long1 (where=(_name_ ne "_OBS_" and not missing(col1)))
;
by _obs_;
var _numeric_;
run;
proc transpose
data=comp (drop=_type_)
out=long2 (where=(_name_ ne "_OBS_" and compress(col1,".") > " "))
;
by _obs_;
var _character_;
run;
proc sql;
create table want as
select distinct _name_
from long1
union
select distinct _name_
from long2
;
quit;
I needed two transposes so that the exclusion conditions don't interfere with the other data type.
Hi Kurt, Super Thanks!
I'm still grumpy. : )
If PROC COMPARE was ODS compliant, it should be as easy as:
ods trace on ;
ods output comparesummary=want ;
proc compare
data=cl1
compare=cl2
;
run;
ods trace off ;
This little bit of the output you want clearly could be an ODS object / table:
Variable Type Len Ndif MaxDif Sex CHAR 1 1 Age NUM 8 1 1.000
But instead, currently PROC COMPARE provides ODS objects that are basically just fixed-length text output you would have to parse yourself:
proc print data=want ;
run ;
Obs type batch 1 d 2 d 3 h Observation Summary 4 h 5 h Observation Base Compare 6 d 7 d First Obs 1 1 8 d First Unequal 7 7 9 d Last Unequal 10 10 10 d Last Obs 19 19 11 d 12 d Number of Observations in Common: 19. 13 d Total Number of Observations Read from WORK.CL1: 19. 14 d Total Number of Observations Read from WORK.CL2: 19. 15 d 16 d Number of Observations with Some Compared Variables Unequal: 2. 17 d Number of Observations with All Compared Variables Equal: 17. 18 d 19 d 20 h Values Comparison Summary 21 h 22 d Number of Variables Compared with All Observations Equal: 3. 23 d Number of Variables Compared with Some Observations Unequal: 2. 24 d Total Number of Values which Compare Unequal: 2. 25 d Maximum Difference: 1. 26 d 27 d 28 h Variables with Unequal Values 29 h 30 h Variable Type Len Ndif MaxDif 31 d 32 d Sex CHAR 1 1 33 d Age NUM 8 1 1.000 34 d
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.