Hi, I am trying to identify the specific variables that are found in dataset1 but not in dataset2. Proc compare can find the common ones but does not list the ones that are distinct:
proc compare data=data1 (obs=0) compare=data2(obs=0); run;
How can I find the unique variables?
Thanks.
@chipp635 wrote:
Hi, I am trying to identify the specific variables that are found in dataset1 but not in dataset2. Proc compare can find the common ones but does not list the ones that are distinct:
proc compare data=data1 (obs=0) compare=data2(obs=0); run;How can I find the unique variables?
I think you may not be researching the appropriate syntax:
When I run this on my data sets:
proc compare data=ipp.hd1_2014(obs=0) compare=ipp.hd457_2014 (obs=0) listbasevar listcompvar ; run;
I get (as expected) in the output
Listing of Variables in IPP.HD1_2014 but not in IPP.HD457_2014 Variable Type Length Label GSpec Char 1 GC Specimen Site Listing of Variables in IPP.HD457_2014 but not in IPP.HD1_2014 Variable Type Length Informat Label ACCESSION_NUMBER Char 16 $16. Record Identifier ORACE Char 12 $STD_ORACE.
I have some example code on that here. I don't find PROC COMPARE particularly useful - they could do a lot to improve it.
I have two demo codes that are useful to me:
create a table that shows which variables are in which datasets:
https://gist.github.com/statgeek/3b57ae085d9f7a36a2d95c15f04e72e6
Identifiers variables in one data set and not in another:
https://gist.github.com/statgeek/e0b98c4627aa31a567e5
@chipp635 wrote:
Hi, I am trying to identify the specific variables that are found in dataset1 but not in dataset2. Proc compare can find the common ones but does not list the ones that are distinct:
proc compare data=data1 (obs=0) compare=data2(obs=0); run;How can I find the unique variables?
Thanks.
Thanks. Would it be possible to do it across multiple datasets and form a table of lets say:
Data1 Data2 Data3 Data4
var1 var1 var1 var1
var2 var2 var2 var2
var3 var3 var3
var4
where matching variables are listed in the same order but distinct ones are listed additionally as well?
.
That's almost exactly what the first program does.
@chipp635 wrote:
Thanks. Would it be possible to do it across multiple datasets and form a table of lets say:
Data1 Data2 Data3 Data4
var1 var1 var1 var1
var2 var2 var2 var2
var3 var3 var3
var4
where matching variables are listed in the same order but distinct ones are listed additionally as well?
.
/*create a uncommon variable for test purpose*/
data w;
set sashelp.class;
_name=name;
run;
data uncommon;
if _n_=1 then do;
dcl hash H (dataset:"sashelp.vcolumn(where=(libname='SASHELP' and memname='CLASS' ))") ;
h.definekey ("name") ;
h.definedone () ;
end;
set sashelp.vcolumn(where=(libname='WORK' and memname='W'));
if h.check() ne 0;
keep name;
run;
@chipp635 wrote:
Hi, I am trying to identify the specific variables that are found in dataset1 but not in dataset2. Proc compare can find the common ones but does not list the ones that are distinct:
proc compare data=data1 (obs=0) compare=data2(obs=0); run;How can I find the unique variables?
I think you may not be researching the appropriate syntax:
When I run this on my data sets:
proc compare data=ipp.hd1_2014(obs=0) compare=ipp.hd457_2014 (obs=0) listbasevar listcompvar ; run;
I get (as expected) in the output
Listing of Variables in IPP.HD1_2014 but not in IPP.HD457_2014 Variable Type Length Label GSpec Char 1 GC Specimen Site Listing of Variables in IPP.HD457_2014 but not in IPP.HD1_2014 Variable Type Length Informat Label ACCESSION_NUMBER Char 16 $16. Record Identifier ORACE Char 12 $STD_ORACE.
Thank you!
I was missing the listbasevar listcompvar code.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.