Hello:
I would like to change some variable names in an old dataset, and then compare the variable names in the old and new datasets. I used PROC COMPARE, but it did not list the details for the variable names. How can I get the complete list of variables in both datasets?
Thanks.
Proc Compare BASE=&State COMPARE=&State.update outnoequal out=toprint;
TITLE1 "&State original names vs Changed Names";
Run;
Oh, so the goal is to get a list of variables that are in only one of the datasets, rather than comparing values of the data?
I like @Tom's idea of PROC COMPARE LISTALL. But I couldn't see an easy way to get an output dataset with the variables:
data myclass;
set sashelp.class(rename=(height=MyHeight));
run;
proc compare base=sashelp.class compare=myclass listall;
run;
Similar to @ballardw's approach, you can use PROC SQL to read dictionary.columns and find the variables that are in only one dataset. As noted before, dictionary.columns gets BIG if you have a lot of libraries defined, so PROC CONTENTS out= followed by a merge or SQL is certainly reasonable as well:
proc sql;
select libname, memname, name
from dictionary.columns
where (libname="SASHELP" and memname="CLASS")
or (libname="WORK" and memname="MYCLASS")
group by name
having count(*)=1
;
quit;
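If an output dataset is what you're after, the same query can probably just be wrapped in CREATE TABLE. This is only a sketch: the table name only_in_one is made up, it assumes the myclass dataset from the step above exists, and the upcase() is there because I'm assuming you'd want names that differ only in case to count as the same variable:

```sas
proc sql;
/* only_in_one is a hypothetical name for the output dataset */
create table only_in_one as
select libname, memname, name
from dictionary.columns
where (libname="SASHELP" and memname="CLASS")
or (libname="WORK" and memname="MYCLASS")
/* group case-insensitively, since SAS variable names ignore case */
group by upcase(name)
having count(*)=1
order by memname, name
;
quit;
```

Each row of only_in_one should then be a variable that exists in just one of the two datasets, with memname telling you which one.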
Do you mean you renamed variables, and then want to compare the variables from the old dataset with the old names to the variables in the new dataset with the new names?
You can use the VAR and WITH statements to list the variables you want to compare, e.g.:
data myclass;
set sashelp.class(rename=(height=MyHeight));
run;
proc compare base=sashelp.class compare=myclass;
var height;
with MyHeight;
run;
Is there a way to list the full variable names? Not just the correct ones? Thanks.
Maybe not exactly what you want but here is one way to only look at variable names:
proc tabulate data=sashelp.vcolumn;
where libname='WORK' and memname in ("SET1" "SET2");
class memname name;
table name, memname;
run;
You will get a count of 1 under the Memname (data set name) column when the variable is present.
Note that the value of Memname in sashelp.vcolumn is uppercase. So you would need to ensure that macro variables are in upper case.
This approach does have the nice feature that Proc Compare will never have: you could compare many sets at one time, just have each one in the where clause. If you wanted to look at all sets in a library then the "and memname in ..." part isn't needed.
You could ask for other things such as type to know if the variable is numeric or character, or labels to see if they have the same label.
SASHELP.VCOLUMN contains lots of information about all variables in all datasets in all libraries. So if you have many library/dataset combinations this may take a little while to run.
I think you want the LISTALL option on the PROC COMPARE statement.
From the documentation, under "Control the listing of variables and observations": LISTALL lists all variables and observations that are found in only one data set.
This will show in the output the variables that do not appear in both datasets.
Or perhaps instead of comparing the original datasets you want to compare the output from proc contents?
Thanks for all your prompt replies. I have 1000 variables, and 300 got renamed. I did use PROC CONTENTS and then a merge to compare the names. I am just curious whether I could do this in one step by using PROC COMPARE.
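For anyone finding this later, the PROC CONTENTS plus merge route can be sketched roughly like this. The names old_vars, new_vars and name_diff are made up for illustration, sashelp.class and myclass just stand in for the real old and new datasets, and upcasing the names is my assumption about how case-only differences should be treated:

```sas
/* demo data: myclass stands in for the "new" dataset */
data myclass;
set sashelp.class(rename=(height=MyHeight));
run;

/* one row per variable for each dataset */
proc contents data=sashelp.class noprint out=old_vars(keep=name);
run;
proc contents data=myclass noprint out=new_vars(keep=name);
run;

/* upcase so that names differing only in case still match */
data old_vars; set old_vars; name=upcase(name); run;
data new_vars; set new_vars; name=upcase(name); run;

proc sort data=old_vars; by name; run;
proc sort data=new_vars; by name; run;

/* keep only the names that appear on one side */
data name_diff;
merge old_vars(in=in_old) new_vars(in=in_new);
by name;
length status $8;
if in_old and not in_new then status='OLD ONLY';
else if in_new and not in_old then status='NEW ONLY';
else delete;
run;

proc print data=name_diff;
run;
```

With 300 renamed variables, name_diff would be expected to hold the 300 old names and the 300 new names, flagged by status.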
Even easier and very fast:
proc compare base =sashelp.class(obs=0)
compare=sashelp.class(drop=age obs=0) listall ;
run;
SAS Output
Listing of Variables in SASHELP.CLASS but not in SASHELP.CLASS
Variable  Type  Length
Age       Num   8