i have two datasets that created in different years. i want to check to see if all variables in the current year are also in the previous year and vs visa. if not, i will print out the variable name. i find a sas code online to do this work but it only can do one variable at a time. i have 70 variables in each datasets and want to check them all in one time. i am not good on macro so i tried several codes online but can not make them work.
data year1;
input gender race age;
datalines;
1 1 20
2 4 30
3 5 40
1 7 50
2 9 60
3 5 70
;
run;
data year2;
input gender race age risk;
datalines;
1 1 30 0
1 1 50 1
1 2 60 2
2 3 70 3
2 4 40 4
2 5 40 5
;
run;
this is the code i find online but seems doesn't work for some reason:
%let dsn1=year1;
%let dsn2=year2;
%let dsid=%sysfunc(open(&dsn1,i));
%let nvar1=%sysfunc(attrn(&dsid,nvars));
%do i=1 %to &nvar1;
%let name1=%sysfunc(varname(&dsid,&i));
%let list1=%str( &list1 &name1 );
%end;
%let bad1=&list1;
%let rc=%sysfunc(close(&dsid));
%let dsid=%sysfunc(open(&dsn2,i));
%let nvar2=%sysfunc(attrn(&dsid,nvars));
%let common=;
%do i=1 %to &nvar2;
%let name2=%sysfunc(varname(&dsid,&i));
%let list2=%str( &list2 &name2 );
%let pos_a=
%eval(%index(&bad1,%str( &name2 )));
%let len_var=%length(&name2);
%let pos_b=%eval(&pos_a + &len_var + 1);
%if (&pos_a = 0) %then
%let bad2=%str( &bad2 &name2 );
%else %do;
%let bad1=%str( )
%substr(&bad1,1,&pos_a)%substr(&bad1,&pos_b)
%str( );
%let common=%str( &common &name2 );
%end;
proc sort data=&dsn1(keep=&common);
by &common;
run;
data _prob;
merge &dsn1(in=indsn1) &dsn2(in=indsn2);
by &common;
length inds $ 40;
if indsn1 and not indsn2 then do;
inds="&dsn1";
output;
end;
if not indsn1 and indsn2 then do;
inds="&dsn2";
output;
end;
run;
SAS keeps lots of information available. The data view SASHELP.VCOLUMN has the names and assigned properties of all the variables in all the currently active libraries.
So you can use Sashelp.vcolumn as input to any procedure. Coupled with a Where clause to filter to records you want there are lots of things you can do.
(code corrected to use correct variable name)
proc tabulate data=sashelp.vcolumn; where libname='WORK' andmembermemname in ('YEAR1' 'YEAR2'); classmembermemname name type; table name*type,membermemname ; run;
Your example code would by default place the data sets in the WORK library. So the where clause uses that library and Memname is the variable that holds data set or view names. Libname and Memname are both stored in uppercase so you need to use upper case to find them.
The above Proc tabulate will have row for each variable in either data set, row for character or numeric if the variable is both or just the one if it is the same in both (or only data set). There will be a column for each data set. The cells with a 1 indicate where a variable appears. If in both columns then in both sets, or if only one then under that set.
I like this Proc Tabulate approach because it works a bit nicer when you need to compare more than two sets at a time. For instance if you omit the Memname part of the Where clause you get all the datasets in a library. Caution: lots of sets with lots of variables may make the output hard to read or possibly exceed working memory.
Don't write a macro. Use PROC COMPARE:
PROC COMPARE generates the following information about the two data sets that are being compared:
SAS keeps lots of information available. The data view SASHELP.VCOLUMN has the names and assigned properties of all the variables in all the currently active libraries.
So you can use Sashelp.vcolumn as input to any procedure. Coupled with a Where clause to filter to records you want there are lots of things you can do.
(code corrected to use correct variable name)
proc tabulate data=sashelp.vcolumn; where libname='WORK' andmembermemname in ('YEAR1' 'YEAR2'); classmembermemname name type; table name*type,membermemname ; run;
Your example code would by default place the data sets in the WORK library. So the where clause uses that library and Memname is the variable that holds data set or view names. Libname and Memname are both stored in uppercase so you need to use upper case to find them.
The above Proc tabulate will have row for each variable in either data set, row for character or numeric if the variable is both or just the one if it is the same in both (or only data set). There will be a column for each data set. The cells with a 1 indicate where a variable appears. If in both columns then in both sets, or if only one then under that set.
I like this Proc Tabulate approach because it works a bit nicer when you need to compare more than two sets at a time. For instance if you omit the Memname part of the Where clause you get all the datasets in a library. Caution: lots of sets with lots of variables may make the output hard to read or possibly exceed working memory.
@juliajulia wrote:
This is exactly what i am looking for. thank you for your help. The 'member' in the original code can not be find, so i changed to 'memname' and it works.
proc tabulate data=sashelp.vcolumn;
where libname='WORK' and member in ('YEAR1' 'YEAR2');
class memname name type;
table name*type,
member
;
run;
My mistake too late at night.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.