BookmarkSubscribeRSS Feed
MILKYLOVE
Calcite | Level 5

Hello,

I have three tables all named diffusion in three different directories. They have the same columns. I have to compare these three tables to know if I have the same values/rows in all 3 tables for the year '2021'. The tables contain values for 2021 and 2022. 

I really don't know where and how to start.

I want to do it using proc sql.

(I have erased some information in the picture due to confidentiality).

 

Screenshot 2022-07-13 101850.png

3 REPLIES 3
Kurt_Bremser
Super User

The SAS tool for comparing datasets is PROC COMPARE.

Are there keys in your datasets which identify unique observations?

Use a WHERE condition to select for the year.

 

For in-depth help, supply usable examples for at least two datasets in data steps with datalines; pictures are not usable.

MILKYLOVE
Calcite | Level 5

So I have used the below code to compare for dates from 1Jan2021 to 1JAN2022 

 

However I receive 120,000 observations and it's difficult for me to point out where the base and compare are not the same. How can I search for the anomalies in the end result please?

 

data date;
     format date_debut date9.;
     format date_fin date9.;
     annee_2021 = year(date())-1;
     annee_2022 = year(date());
     date_debut = mdy(1,1, input(annee_2021, best.));
     date_fin = mdy(1,1, input(annee_2022, best.));
     call symput('date_debut', date_debut);
     call symput('date_fin', date_fin);
run;


proc sql;
create table t1 as 
select * from adobe.diffusion 
where ADB_DATE_MODIFICATION >= dhms(&date_debut,0,0,0)
and ADB_DATE_MODIFICATION <= dhms(&date_fin,0,0,0);
quit;


proc sql;
create table t2 as 
select * from prod.diffusion 
where ADB_DATE_MODIFICATION >= dhms(&date_debut,0,0,0)
and ADB_DATE_MODIFICATION <= dhms(&date_fin,0,0,0);
quit;

proc sort data = t1 out = diff1_bydate;
by ADB_DATE_MODIFICATION;
run;

proc sort data = t2 out = diff2_bydate;
by ADB_DATE_MODIFICATION;
run;

proc compare base = diff1_bydate compare = diff2_bydate
out = result outnoequal outbase outcomp outdif
noprint;
id ADB_DATE_MODIFICATION;
title 'Comparaison' ;
run;
ballardw
Super User

Suggest, if you are only interested in a few variables that you add a VAR statement to the Proc Compare code.

The VAR statement will restrict the comparisons to only the variables on the statement. So supply the date variables of interest.

 

With Proc Compare you also want to sort the data sets in some fashion that the order makes sense for comparing records. I am not sure that dates alone are appropriate if you want to see if the dates associated with some characteristic such as company, person, instrument or other grouping.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 960 views
  • 0 likes
  • 3 in conversation