I encountered a situation where I just need to compare the two dataset based on the result I need to give a value they are same or not.
I have the same datasets in different folders. I need to compare both and see they are same ? if they are same then I need to create a variable 'EQUAL'= 'Y' otherwise 'N'. I have the idea of proc compare and I can read the lst file and see if they are equal or not. In my case it's not a suitable idea because I am writing this in macro depending on this further macro will be executed. I thought of counting the observations by outputting the compare dataset but got stuck after words .How we can a achieve this?
In my example, I tried my best to simulate the condition. Excuse me if I have any errors.
According to my example ,
1. I am expecting a variable "EQUAL' = 'Y' ( Even though the lengths are different.)
2. is it possible we can grab if there is a changes in length, format, labels ? and create a variable with the list of the things changes ( I think its too hard to achieve. if we can not that's ok.)
Thank you for your inputs.
How I am expecting the dataset like below . 'LIST' is optional if its too much of work.
libname folder1 'c:\November';
libname folder2 'c:\December';
*in my case dataset name is same both folders . For the purpose to run the code, I created the same file with different name;
data folder1.x1;
set sashelp.class;
run;
data folder2.x2;
length name $30.;
set x1;
run;
proc compare base= folder1.x1 compare= folder2.x2 out= x3 outnoequal noprint;
run;
proc sql;
create table cnt as select count(*) as N from x3;
quit;
/*
That is easy if these two tables have the same variable name with same order and type.
*/
libname folder1 'c:\temp\old';
libname folder2 'c:\temp\new';
data folder1.x1;
set sashelp.class;
run;
data folder2.x2;
set sashelp.class;
run;
data folder1.y1;
set sashelp.heart;
run;
data folder2.y2;
set sashelp.heart;
if _n_=1 then call missing(of _numeric_);
run;
%macro compare_dsn(first=,second= , output=);
proc sql;
create table &output. as
(
select * from &first.
except
select * from &second.
)
union
(
select * from &second.
except
select * from &first.
)
;
quit;
%mend;
proc datasets library=work kill memtype=data nolist nodetails;
quit;
%compare_dsn(first= folder1.x1,second= folder2.x2 , output=x)
%compare_dsn(first= folder1.y1,second= folder2.y2 , output=y)
/*generate the comparing report */
proc sql;
select memname,ifc(nlobs=0,'Y','N') as equal
from dictionary.tables
where libname='WORK' ;
quit;
Do you want to compare the contents or the structures of both datasets?
Thanks for your response. My major focus in Contents ( if the dataset is same or not irrespective of structural changes)
There is a not well known way of doing what you want with PROC COMPARE then checking the return code from the procedure:
proc compare base = BaseTable
compare = CompareTable
listvar out = difs outnoequal
;
run;
%let rc = &sysinfo;
data _null_;
%* Test for data set label ;
if &rc = '1'b then
put "<<<< &table.: Data sets have different labels";
%* Test for data set types ;
if &rc = '1.'b then
put "<<<< &table.: Data set types differ";
%* Test for label ;
if &rc = '1.....'b then
put "<<<< &table.: Variable has different label";
%* Test for base observation ;
if &rc = '1......'b then
put "<<<< &table.: Base data set has observation not in comparison data set";
%* Test for length ;
if &rc = '1....'b then
put "<<<< &table.: Variable has different lengths between the base data set and the comparison data set";
%* Variable in base data set not in compare data set;
if &rc ='1..........'b then
put "<<<< &table.: Variable in base data set not found in comparison data set";
%* Comparison data set has variable not in base data set ;
if &rc = '1...........'b then
put "<<<< &table.: Comparison data set has variable not contained in the base data set";
%* Test for values ;
if &rc = '1............'b then
put "<<<< &table.: A value comparison was unequal";
%* Conflicting variable types;
if &rc ='1.............'b then
put "<<<< &table.: Conflicting variable types between the two data sets being compared";
run;
Thank you @SASKiwi . I thought the same. how about if we just want to check the content? looking data same or not! like run the compare and out the no equal dataset,----> then count the observation, if count= 0 then datasets are equal otherwise NO? and Whats '&Table" in your code refer to?
@SASuserlot - Sorry, I forgot to correct the code I posted:
%let table = MyTable;
proc compare base = Folder1.&Table.
compare = Folder2.&Table.
listvar out = difs outnoequal
;
run;
%let rc = &sysinfo;
data _null_;
%* Test for data set label ;
if &rc = '1'b then
put "<<<< &table.: Data sets have different labels";
%* Test for data set types ;
if &rc = '1.'b then
put "<<<< &table.: Data set types differ";
%* Test for label ;
if &rc = '1.....'b then
put "<<<< &table.: Variable has different label";
%* Test for base observation ;
if &rc = '1......'b then
put "<<<< &table.: Base data set has observation not in comparison data set";
%* Test for length ;
if &rc = '1....'b then
put "<<<< &table.: Variable has different lengths between the base data set and the comparison data set";
%* Variable in base data set not in compare data set;
if &rc ='1..........'b then
put "<<<< &table.: Variable in base data set not found in comparison data set";
%* Comparison data set has variable not in base data set ;
if &rc = '1...........'b then
put "<<<< &table.: Comparison data set has variable not contained in the base data set";
%* Test for values ;
if &rc = '1............'b then
put "<<<< &table.: A value comparison was unequal";
%* Conflicting variable types;
if &rc ='1.............'b then
put "<<<< &table.: Conflicting variable types between the two data sets being compared";
run;
Yes, you could just check the COMPARE output dataset and if there are any rows that is evidence there are data differences between the two tables.
/*
That is easy if these two tables have the same variable name with same order and type.
*/
libname folder1 'c:\temp\old';
libname folder2 'c:\temp\new';
data folder1.x1;
set sashelp.class;
run;
data folder2.x2;
set sashelp.class;
run;
data folder1.y1;
set sashelp.heart;
run;
data folder2.y2;
set sashelp.heart;
if _n_=1 then call missing(of _numeric_);
run;
%macro compare_dsn(first=,second= , output=);
proc sql;
create table &output. as
(
select * from &first.
except
select * from &second.
)
union
(
select * from &second.
except
select * from &first.
)
;
quit;
%mend;
proc datasets library=work kill memtype=data nolist nodetails;
quit;
%compare_dsn(first= folder1.x1,second= folder2.x2 , output=x)
%compare_dsn(first= folder1.y1,second= folder2.y2 , output=y)
/*generate the comparing report */
proc sql;
select memname,ifc(nlobs=0,'Y','N') as equal
from dictionary.tables
where libname='WORK' ;
quit;
Thank you @Ksharp .
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.