Hello,
I have three tables all named diffusion in three different directories. They have the same columns. I have to compare these three tables to know if I have the same values/rows in all 3 tables for the year '2021'. The tables contain values for 2021 and 2022.
I really don't know where and how to start.
I want to do it using proc sql.
(I have erased some information in the picture due to confidentiality).
The SAS tool for comparing datasets is PROC COMPARE.
Are there keys in your datasets which identify unique observations?
Use a WHERE condition to select for the year.
For in-depth help, supply usable examples for at least two datasets in data steps with datalines; pictures are not usable.
So I have used the below code to compare for dates from 1Jan2021 to 1JAN2022
However I receive 120,000 observations and it's difficult for me to point out where the base and compare are not the same. How can I search for the anomalies in the end result please?
data date; format date_debut date9.; format date_fin date9.; annee_2021 = year(date())-1; annee_2022 = year(date()); date_debut = mdy(1,1, input(annee_2021, best.)); date_fin = mdy(1,1, input(annee_2022, best.)); call symput('date_debut', date_debut); call symput('date_fin', date_fin); run; proc sql; create table t1 as select * from adobe.diffusion where ADB_DATE_MODIFICATION >= dhms(&date_debut,0,0,0) and ADB_DATE_MODIFICATION <= dhms(&date_fin,0,0,0); quit; proc sql; create table t2 as select * from prod.diffusion where ADB_DATE_MODIFICATION >= dhms(&date_debut,0,0,0) and ADB_DATE_MODIFICATION <= dhms(&date_fin,0,0,0); quit; proc sort data = t1 out = diff1_bydate; by ADB_DATE_MODIFICATION; run; proc sort data = t2 out = diff2_bydate; by ADB_DATE_MODIFICATION; run; proc compare base = diff1_bydate compare = diff2_bydate out = result outnoequal outbase outcomp outdif noprint; id ADB_DATE_MODIFICATION; title 'Comparaison' ; run;
Suggest, if you are only interested in a few variables that you add a VAR statement to the Proc Compare code.
The VAR statement will restrict the comparisons to only the variables on the statement. So supply the date variables of interest.
With Proc Compare you also want to sort the data sets in some fashion that the order makes sense for comparing records. I am not sure that dates alone are appropriate if you want to see if the dates associated with some characteristic such as company, person, instrument or other grouping.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.