Hi All,
I want to see only records with differences included in my output with the structure shown below:
_TYPE_ |
obs |
var1 |
var2 |
var3 |
var4 |
BASE |
1 |
34 |
115003481 |
1214 |
48 |
COMPARE |
1 |
34 |
115003481 |
1214 |
48 |
DIF |
1 |
34 |
115003481 |
1214 |
48 |
The program used was:
proc compare base=F.&MATCH1 compare=F.&MATCH2
OUT=F.&DIFF
outbase
outcomp
outdif
;
id var1 var2 var3 var4 ;
run;
Number of Observations in Common: 4083722.
Number of Observations in F.MATCH1 but not in F.MATCH2: 158.
Total Number of Observations Read from F.MATCH1: 4083722.
Total Number of Observations Read from F.MATCH2: 4083880.
Number of Observations with Some Compared Variables Unequal: 2.
Number of Observations with All Compared Variables Equal: 4083720.
Values Comparison Summary
Number of Variables Compared with All Observations Equal: 173.
Number of Variables Compared with Some Observations Unequal: 2.
Number of Variables with Missing Value Differences: 2.
Total Number of Values which Compare Unequal: 2.
Maximum Difference: 0.
Why my output file is generating more than 12 millions of observations -including that are exactly equals in both files- ??, I only want on my ouput file with unequals observations, like this:
Number of Observations with Some Compared Variables Unequal: 2.
Thanks in advance and regards,
JC
Looking at the documenation (or below in red) you can see why you're getting the behaviour you've described. You can fix this by specifying the OUTNOEQUAL option instead. See the code below.
Documentation for PROC COMPARE:
writes an observation for each observation in the base data set.
writes an observation for each observation in the comparison data set
writes an observation to the output data set for each pair of matching observations.
suppresses the writing of observations when all values are equal.
*sample data to check;
data class;
set sashelp.class;
if age=12 then weight=50;
run;
*proc compare test;
proc compare data=class compare=sashelp.class out=check outnoequal ;
id name;
run;
*results;
proc print data=check;run;
Looking at the documenation (or below in red) you can see why you're getting the behaviour you've described. You can fix this by specifying the OUTNOEQUAL option instead. See the code below.
Documentation for PROC COMPARE:
writes an observation for each observation in the base data set.
writes an observation for each observation in the comparison data set
writes an observation to the output data set for each pair of matching observations.
suppresses the writing of observations when all values are equal.
*sample data to check;
data class;
set sashelp.class;
if age=12 then weight=50;
run;
*proc compare test;
proc compare data=class compare=sashelp.class out=check outnoequal ;
id name;
run;
*results;
proc print data=check;run;
Thanks man!!
It worked for me !!
Hi Jcorti,
I have been using OUTNOEQUAL option while using proc compare and still getting the equal values in the output. Please find attached the screenshot
Hi Reeza,
If you can give your inputs.
Thanks,
Hi,
I would like to create a dataset that shows the values produced by OUTBASE and OUTCOMP but only for people with unequal values. Is there a way to do this? The following code, even with OUTNOEQUAL, shows the values for all observations.
proc compare base=HLA_internal_test comp=HLA_external_test out=dif2 outbase outcomp outnoequal;
ID PID;
run;
Thank you in advance!
@GAL1986 personally I don't use PROC COMPARE, I would do a custom SQL or data step since PROC COMPARE output is difficult to wrangle. If you really want PROC COMPARE you likely need to take the full data out and restrict the data set in a second data step anyways.
PS. Please post your own question, not as a response to a thread that's 9 months old.
@GAL1986 wrote:
Hi,
I would like to create a dataset that shows the values produced by OUTBASE and OUTCOMP but only for people with unequal values. Is there a way to do this? The following code, even with OUTNOEQUAL, shows the values for all observations.
proc compare base=HLA_internal_test comp=HLA_external_test out=dif2 outbase outcomp outnoequal;
ID PID;
run;
Thank you in advance!
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.