
compare data sets and extract the delta comparing all the variables

Dear experts,

 

I am looking for an effective way (I assume PROC COMPARE or PROC SQL) to compare two data sets. The second data set should contain almost the same data as the previous one (with some small changes) plus additional observations.

My aim is to get the complete delta between the two data sets, i.e. the changes to the existing data and the new observations.

Any suggestions? Thanks a lot, SH.


Accepted Solutions
‎05-25-2016 05:43 AM
Super User

Re: compare data sets and extract the delta comparing all the variables

Well, PROC COMPARE is the obvious choice: you can send its output to a data set and then process that further for a nicer report. I find its output is sometimes a bit difficult to review, though. I would suggest you try both options. Sometimes, if I want to see the data that is in one data set and not in the other, I use SQL:

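/* Rows that are in HAVE1 but have no identical row in HAVE2.
   Swap the two table names to get the rows that are new or changed in HAVE2. */
proc sql;
  select *
  from   HAVE1
  except
  select *
  from   HAVE2;
quit;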
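
For the PROC COMPARE route, a minimal sketch of sending the results to a data set could look like the following. It assumes HAVE1 is the older data set and HAVE2 the newer one, and that both can be matched on a key variable; the name ID and the output data set DIFFS are hypothetical and not from the original question. It is worth checking on your own data that the output covers everything you need.

```sas
/* Assumption: ID is a hypothetical key variable present in both data sets. */
proc sort data=HAVE1 out=OLD; by ID; run;
proc sort data=HAVE2 out=NEW; by ID; run;

proc compare base=OLD compare=NEW
             out=DIFFS        /* write the comparison results to a data set   */
             outnoequal       /* drop observations judged equal on all values */
             outbase outcomp  /* keep the values from both data sets          */
             noprint;         /* suppress the printed report                  */
  id ID;                      /* match observations on the key, not position  */
run;
```

The _TYPE_ variable in DIFFS identifies which data set each output observation came from, which makes the result easier to post-process than the printed report.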

It's a bit difficult to give one answer, as it's very much a matter of preference: what you want the output to look like, what level of checking you need, etc. It could be that merging one data set onto the other and then processing the result in a DATA step or SQL would be helpful, e.g. if values look similar or match but do not agree logically.
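
As one possible sketch of the merge idea, the step below takes the rows of HAVE2 that have no identical row in HAVE1 and then merges them against the keys of HAVE1 to label each one as new or changed. The key variable ID, the data set names DELTA, OLD_KEYS and DELTA_FLAGGED, and the variable change_type are all hypothetical; the sketch also assumes ID is unique within each data set and that both data sets have the same columns in the same order.

```sas
/* Rows in HAVE2 with no identical row in HAVE1: new or changed observations. */
proc sql;
  create table DELTA as
  select * from HAVE2
  except
  select * from HAVE1;
quit;

/* Assumption: ID is a hypothetical unique key in both data sets. */
proc sort data=DELTA; by ID; run;
proc sort data=HAVE1(keep=ID) out=OLD_KEYS; by ID; run;

data DELTA_FLAGGED;
  merge DELTA(in=in_delta) OLD_KEYS(in=in_old);
  by ID;
  if in_delta;                      /* keep only rows that are part of the delta */
  length change_type $7;
  /* Key already existed in HAVE1: some value changed. Otherwise the row is new. */
  change_type = ifc(in_old, 'CHANGED', 'NEW');
run;
```

From here, further checks (for example, comparing individual variables for the CHANGED rows) can be added in the same DATA step or in a follow-up query, depending on the level of checking wanted.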

 

