- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi All,
I want to see only records with differences included in my output with the structure shown below:
_TYPE_ |
obs |
var1 |
var2 |
var3 |
var4 |
BASE |
1 |
34 |
115003481 |
1214 |
48 |
COMPARE |
1 |
34 |
115003481 |
1214 |
48 |
DIF |
1 |
34 |
115003481 |
1214 |
48 |
The program used was:
proc compare base=F.&MATCH1 compare=F.&MATCH2
OUT=F.&DIFF
outbase
outcomp
outdif
;
id var1 var2 var3 var4 ;
run;
Number of Observations in Common: 4083722.
Number of Observations in F.MATCH1 but not in F.MATCH2: 158.
Total Number of Observations Read from F.MATCH1: 4083722.
Total Number of Observations Read from F.MATCH2: 4083880.
Number of Observations with Some Compared Variables Unequal: 2.
Number of Observations with All Compared Variables Equal: 4083720.
Values Comparison Summary
Number of Variables Compared with All Observations Equal: 173.
Number of Variables Compared with Some Observations Unequal: 2.
Number of Variables with Missing Value Differences: 2.
Total Number of Values which Compare Unequal: 2.
Maximum Difference: 0.
Why my output file is generating more than 12 millions of observations -including that are exactly equals in both files- ??, I only want on my ouput file with unequals observations, like this:
Number of Observations with Some Compared Variables Unequal: 2.
Thanks in advance and regards,
JC
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Looking at the documenation (or below in red) you can see why you're getting the behaviour you've described. You can fix this by specifying the OUTNOEQUAL option instead. See the code below.
Documentation for PROC COMPARE:
OUTBASE
writes an observation for each observation in the base data set.
OUTCOMP
writes an observation for each observation in the comparison data set
OUTDIF
writes an observation to the output data set for each pair of matching observations.
OUTNOEQUAL
suppresses the writing of observations when all values are equal.
*sample data to check;
data class;
set sashelp.class;
if age=12 then weight=50;
run;
*proc compare test;
proc compare data=class compare=sashelp.class out=check outnoequal ;
id name;
run;
*results;
proc print data=check;run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Looking at the documenation (or below in red) you can see why you're getting the behaviour you've described. You can fix this by specifying the OUTNOEQUAL option instead. See the code below.
Documentation for PROC COMPARE:
OUTBASE
writes an observation for each observation in the base data set.
OUTCOMP
writes an observation for each observation in the comparison data set
OUTDIF
writes an observation to the output data set for each pair of matching observations.
OUTNOEQUAL
suppresses the writing of observations when all values are equal.
*sample data to check;
data class;
set sashelp.class;
if age=12 then weight=50;
run;
*proc compare test;
proc compare data=class compare=sashelp.class out=check outnoequal ;
id name;
run;
*results;
proc print data=check;run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thanks man!!
It worked for me !!
- Tags:
- gclub
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi Jcorti,
I have been using OUTNOEQUAL option while using proc compare and still getting the equal values in the output. Please find attached the screenshot
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi Reeza,
If you can give your inputs.
Thanks,
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
I would like to create a dataset that shows the values produced by OUTBASE and OUTCOMP but only for people with unequal values. Is there a way to do this? The following code, even with OUTNOEQUAL, shows the values for all observations.
proc compare base=HLA_internal_test comp=HLA_external_test out=dif2 outbase outcomp outnoequal;
ID PID;
run;
Thank you in advance!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@GAL1986 personally I don't use PROC COMPARE, I would do a custom SQL or data step since PROC COMPARE output is difficult to wrangle. If you really want PROC COMPARE you likely need to take the full data out and restrict the data set in a second data step anyways.
PS. Please post your own question, not as a response to a thread that's 9 months old.
@GAL1986 wrote:
Hi,
I would like to create a dataset that shows the values produced by OUTBASE and OUTCOMP but only for people with unequal values. Is there a way to do this? The following code, even with OUTNOEQUAL, shows the values for all observations.
proc compare base=HLA_internal_test comp=HLA_external_test out=dif2 outbase outcomp outnoequal;
ID PID;
run;
Thank you in advance!