Hello,
I have been asked to compare more than two SAS data-sets. There is one base data-set and more than one compare data-sets. Is there any special program that I have to write to do this job?
Thank you,
AB
You can use PROC COMPARE multiple times or you can write in your own. If you're comparing more than a few and need specific output, its usually easiest to write your own IMO.
@ab645 wrote:
Hello,
I have been asked to compare more than two SAS data-sets. There is one base data-set and more than one compare data-sets. Is there any special program that I have to write to do this job?
Thank you,
AB
what comparison do you want do perform?
merge or hash might fill your needs
I might be tempted to start by just looking at dataset/variables characteristics and membership, especially if you aren't actually interested at first in a record by record comparison.
SAS maintains a number of informational sources you can query such as dictionary tables.
Depending on the actual questions you need to get from your comparisons this may be all you need.
The following example builds some similar data sets then provides summaries of numbers of observations and variables per data set and another set of variable summaries such as type, label and formats.
The 1 in the body of the table tells which what is where.
Data work.junk1; informat x $25. y best4. dday mmddyy10.; format dday mmddyy10.; input x y dday; label x='Account ID'; datalines; abc 123 01/01/2015 pdq 4.56 12/12/2012 ; run; Data work.junk2; informat x $20. y $4. dday mmddyy10.; format dday mmddyy10.; input x y dday; label x='Client ID'; datalines; abc 444 01/01/2015 pdq 4.56 12/12/2012 pdq 202 12/12/2012 ; run; Data work.junk3; informat x $15. y $4. dday $10.; input x y dday z ; label x = 'Something Else'; datalines; abc 123 01/01/2015 88 pdq 4.56 12/12/2012 99 ; run; proc sql; create table datasettab as select * from dictionary.tables where libname='WORK' and memname in ('JUNK1' 'JUNK2' 'JUNK3') ; quit; proc tabulate data=datasettab; class memname; class nobs nvar; tables nobs nvar, memname*n='' /misstext=' ' ; run; proc sql; create table vartab as select * from dictionary.columns where libname='WORK' and memname in ('JUNK1' 'JUNK2' 'JUNK3') ; quit; proc tabulate data=vartab; class memname name type; class label format informat/missing; tables name*type, memname*n='' /misstext=' ' ; tables name*label, memname*n='' /misstext=' ' ; tables name*type*format*informat, memname*n='' /misstext=' ' ; run;
So it is easy to see that Junk1 and Junk3 have the same number of observations and Junk1 and Junk2 have the same number of variables.
The second table shows which table each variable is numeric, character. The third shows labels per variable and data set. This might be very helpful. (You do use variable labels don't you). The last shows the variable type with format and informat information which may clarify issues with code working on set where a date variable is actually a SAS date valued variable and not in another where the data is actually character but happens to look like a date.
This could reduce what you look for if you go to use Proc compare as you could use the VAR and WITH options to only compare variables that are likely to be the same.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.