Scenario: I have two datasets: A and B. I would to know which variables they have in common, which variables are in A but not in B and vice versa.
Question: If you know any handy SAS program to accomplish this task, please share with me.
Thanks in advance.
Proc compare.
Basic syntax Proc compare base=<datasetname> compare=<datasetname>; run;
Here's a variation you may like. Extending an earlier suggestion:
proc compare data=data1 (obs=0) compare=data2 (obs=0);
run;
That way, you get all the structural differences (not just variable names) but you don't get all the detail about which data values are different. For example, if the same variable name appears in both data sets, but is defined as character in one but numeric in the other, you may want to know that.
Good luck.
Here is one way, although I'm sure there are many:
/*create some test data*/
data have1;
set sashelp.class;
retain x y z (0,0,0);
run;
data have2;
set sashelp.class;
retain a b c (0,0,0);
run;
proc sql;
create table names1 as
select name
from dictionary.columns
where libname="WORK" and
memname="HAVE1"
order by name
;
create table names2 as
select name
from dictionary.columns
where libname="WORK" and
memname="HAVE2"
order by name
;
quit;
data want;
merge names1 (in=in1) names2 (in=in2);
by name;
if in1 and in2 then type="in both";
else if in1 then type="in 1";
else if in2 then type="in 2";
run;
proc print data=want;
run;
data have1; set sashelp.class; retain x y z (0,0,0); run; data have2; set sashelp.class; retain a b c (0,0,0); run; proc sql; create table common_var as select name from dictionary.columns where libname='WORK' and memname='HAVE1' intersect select name from dictionary.columns where libname='WORK' and memname='HAVE2'; create table in_a as select name from dictionary.columns where libname='WORK' and memname='HAVE1' except select name from dictionary.columns where libname='WORK' and memname='HAVE2'; create table in_b as select name from dictionary.columns where libname='WORK' and memname='HAVE2' except select name from dictionary.columns where libname='WORK' and memname='HAVE1'; quit;
Ksharp
Of course, like ballardw suggested, proc compare will give you whole-9-yard comparison. But if what you want is just variable names, here is an another approach:
data h1;
set sashelp.class;
rename age=h1_age;
run;
data h2;
set h1;
rename h1_age=h2_age;
run;
proc contents data=h1 out=h1out (keep=name);
run;
proc contents data=h2 out=h2out(keep=name);
run;
proc sort data=h1out;
by name;
run;
proc sort data=h2out;
by name;
run;
data common h1only h2only;
merge h1out(in=h1) h2out(in=h2);
by name;
if h1 and h2 then output common;
else if h1 and not h2 then output h1only;
else output h2only;
run;
Haikuo
Great, works for me, thanks!
My favorite approach is also SQL/Dictionary Tables , but for the sake of variety, one could also use the approach below, again relying on Proc Contents:
proc contents data=sashelp.class (keep=weight age sex) out=X (keep=name) noprint;
run;
proc contents data=sashelp.class (rename=(name=tag weight=mass)) out=Y (keep=name) noprint;
run;
data XnotinY YnotinX;
if _N_=1 then do;
declare hash hx (dataset:"X");
hx.definekey("name");
hx.definedata("name");
hx.definedone();
declare hash hy (dataset:"Y");
hy.definekey("name");
hy.definedata("name");
hy.definedone();
end;
do until(xdone);
set X end=xdone;;
if hy.check() ne 0 then output XnotinY;
call missing(name);
end;
do until(ydone);
set Y end=ydone;
if hx.check() ne 0 then output YnotinX;
call missing(name);
end;
run;
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.