I would like to compare two datasets and find the non matching records. In below example, both the datasets has same Observation and I believe only order of the records should be differ in both the datasets. When I run the following code, it says 4096 records are differ in both the datasets.
I would like to know whether proc compare will consider the order of the record also to be identical and I wanted to the best method to compare two datasets irrespective of order of the record. If there is a difference, I want to see the records which differs.
proc compare base=test compare=test1;
run;
%put &sysinfo.;
Log:
26 proc compare base=test compare=test1;
27 run;
NOTE: There were 5985917 observations read from the data set WORK.TEST.
NOTE: There were 5985917 observations read from the data set WORK.TEST1.
NOTE: At least one W.D format was too small for the number to be printed. The decimal may be shifted by the "BEST" format.
NOTE: PROCEDURE COMPARE used (Total process time):
real time 28.39 seconds
cpu time 27.54 seconds
28
29 %put &sysinfo.;
SYMBOLGEN: Macro variable SYSINFO resolves to 4096
How about the following
data a;
input ID var1 var2;
datalines;
001 1 2
002 2 2
003 1 2
004 1 2
006 1 3
;
run;
data b;
input ID var1 var2;
datalines;
001 1 2
002 1 2
003 1 2
004 1 3
007 2 2
;
run;
proc compare base=a comp=b noprint OUTNOEQUAL OUT=COMP ;
by ID;
run;
Yes, PROC COMPARE considers observation order. Sort your two data sets by the same variable and compare again. In my book, PROC COMPARE is still the best tool for the job.
See this small example
data a;
input var1 var2;
datalines;
1 2
3 4
;
data b;
input var1 var2;
datalines;
3 4
1 2
;
proc compare base=a comp=b;
run;
How about the following
data a;
input ID var1 var2;
datalines;
001 1 2
002 2 2
003 1 2
004 1 2
006 1 3
;
run;
data b;
input ID var1 var2;
datalines;
001 1 2
002 1 2
003 1 2
004 1 3
007 2 2
;
run;
proc compare base=a comp=b noprint OUTNOEQUAL OUT=COMP ;
by ID;
run;
I'm unable to understand the data in the Output dataset COMP.
A description of the OUT= dataset is found here.
Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.
Register today!Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Select SAS Training centers are offering in-person courses. View upcoming courses for: