BookmarkSubscribeRSS Feed
thanikondharish
Calcite | Level 5

data ex1 ;
input id $;
cards ;
101
102
103
;
data ex2 ;
input id $ ;
cards ;
101
104
;

 

proc compare data=ex1 compare=ex2  out=diff;
run;

 

I wrote proc compare .

How to create dataset for different values like

dataset1:

101

103

 

data set2:

104

7 REPLIES 7
PeterClemmensen
Tourmaline | Level 20

Do you want to do this with PROC COMPARE or any tool?

PeterClemmensen
Tourmaline | Level 20

I think this is what you want..

 

data ex1 ;
input id $;
cards ;
101
102
103
;
data ex2 ;
input id $ ;
cards ;
101
104
;

proc compare base=ex1 comp=ex2 out=dataset1(keep=id) outbase outnoequal noprint;
run;

proc compare base=ex2 comp=ex1 out=dataset2(keep=id) outbase outnoequal noprint;
run;
ballardw
Super User

Questions arise for your requirement as to are you looking for values of Id in Ex1 that do not appear in Ex2 or are you looking for observations in order values of ID in Ex1 and Ex2?

 

Proc Compare matches rows by order (unless you are using BY and there are still concerns) So if you have

data work.ex2 ;
input id $ ;
cards ;
104
101 
;

The 101 do no match unless you sort both sets by that value and even then you may have issues as the gaps in the variables have mismatched values.

 

If the concern is the values and not the order / observations then perhaps:

data work.ex1 ;
input id $;
cards ; 
101
102
103
;
data work.ex2 ;
input id $ ;
cards ;
101 
104
;

proc sql;
   create table work.inex1notex2 as
   select id from work.ex1
   except 
   select id from work.ex2
   ;
quit;

proc sql;
   create table work.inex2notex1 as
   select id from work.ex2
   except 
   select id from work.ex1
   ;
quit;
thanikondharish
Calcite | Level 5
Use only proc compare
The values in ex2 and don't in ex1

ballardw
Super User

@thanikondharish wrote:
Use only proc compare
The values in ex2 and don't in ex1


Proc Compare is really more intended for data sets that have the same number of observations in the same order (or the likely large number of mismatched values tells that isn't happening).

And value that appears after the other data set is exhausted for comparison is not going to do what you expect. The Proc basically stops examining anything after the number of rows common to both are exhausted. If one set has 50 records and the other 25 the procedure stops comparing after 25 rows are compared. So you have no idea about the other records though th summary will tell when the last match occurred and that the two sets have different number of observations.

 

And if the data isn't sorted you may not get the "match" you want. Example: Both data sets have the exact same values but appear in different order:

data work.one;
   do i= 1 to 10;
   output;
   end;
run;
data work.two;
   do i= 10 to 1 by -1;
   output;
   end;
run;

proc compare base=work.two compare=work.one;
run;

No matches.

Tom
Super User Tom
Super User

PROC COMPARE is NOT the right tool for that type of comparison.  It is for comparing values between MATCHING observations.

Your output is a summary of the NON matching observations.  That is observations that are excluded from one set or the other.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 546 views
  • 0 likes
  • 4 in conversation