Hi,
I have similar datasets.
Dataset A has 376 variables and
Dataset B has 428 variables.
How do I get a list of the variables which are not in both datasets?
Thanks
Proc Compare is one tool that provides that information (as well as if the same-named variables are of different types).
Proc compare base=datasetA compare=datasetb novalues listvar;
run;
1. Get a list of variables from each data set
Use either SASHELP.VCOLUMN or PROC CONTENTS
2A. Use SQL to compare them
2B. Use PROC COMPARE to compare the lists
2C. Use a data step to do the comparison.
I have an example of the 2C approach here - note this is a fully worked example so the first few steps are generating sample data. Please make sure to read the comments.
*Here's a bit of a long way to generate the list of datasets with variables present.
This program looks at ALL variables, to restrict it to a select list, see the line of
code to be modified via the comments.;
/*Generate fake data to work with*/
data class1;
set sashelp.class;
drop age sex;
run;
data class2;
set sashelp.class;
drop weight height;
run;
data class3;
set sashelp.class;
Order=1;
run;
data class4;
set sashelp.class;
keep name;
run;
data class5;
set sashelp.class;
run;
/*Extract metadata from dictionary tables*/
proc sql noprint;
create table column_list as select memname, libname, name, type, 1 as count
from dictionary.columns where libname='WORK' and memname like 'CLASS%'
/*ADD THE VARIABLE LIST HERE*/
order by memname, name;
quit;
/*Transpose results to a more user friendly format*/
proc transpose data=column_list out=flipped;
by memname;
id name;
idlabel name;
var count;
run;
/*Format output*/
data want;
retain memname;
set flipped;
array class(*) _NUMERIC_;
do i=1 to dim(class);
if class(i)=. then
class(i)=0;
end;
All_Variables=sum(of _numeric_)-I;
DROP I _NAME_;
run;
https://gist.github.com/statgeek/3b57ae085d9f7a36a2d95c15f04e72e6
And a slightly different implementation:
Proc Compare is one tool that provides that information (as well as if the same-named variables are of different types).
Proc compare base=datasetA compare=datasetb novalues listvar;
run;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.