Hi,
Is it possible to subset a data set based on the proc contents output. For e.g. I want to exclude all those variables which are a part of another data set .
Thanks in advance.
If you insist on using PROC CONTENTS output then do like below.
Otherwise, i agree with @Kurt_Bremser, get the information from dictionary tables or sashelp.vtable instead.
proc contents noprint data=sashelp.heart out=VarsToDrop(keep=name);
run;
proc sql noprint;
select name into :vars separated by ' '
from VarsToDrop;
quit;
%put &vars.;
data ClassExludingVars;
set sashelp.class;
drop &vars.;
run;
Retrieve meta-information about datasets from dictionary.tables (SQL) or sashelp.vtable (data step). This will then help you in creating macro variables for use in drop or keep statements.
If you insist on using PROC CONTENTS output then do like below.
Otherwise, i agree with @Kurt_Bremser, get the information from dictionary tables or sashelp.vtable instead.
proc contents noprint data=sashelp.heart out=VarsToDrop(keep=name);
run;
proc sql noprint;
select name into :vars separated by ' '
from VarsToDrop;
quit;
%put &vars.;
data ClassExludingVars;
set sashelp.class;
drop &vars.;
run;
Let's say you want all the variables in B that are not also in A:
data bvars_only;
retain _sentinel1;
if 0 then set a;
retain _sentinel2;
set b;
drop _sentinel1--_sentinel2;
run;
This relies on how SAS constructs the program data vector. This program orders variables in the PDV from left to right as
The drop statement, using a double-dash connector tells SAS to drop all the vars in item1, item2, and item3.
Let's say you want to drop variables in B which also appear in A. You could start with:
proc contents data=A noprint out=contents_A (keep=name);
run;
proc contents data=B noprint out=contents_B (keep=name);
run;
Now the issues are (1) finding variable names that appear in both, despite the fact that (2) capitalization of the names can be different in both data sets. So:
data contents_A;
set contents_A;
name = upcase(name);
run;
proc sort data=contents_A;
by name;
run;
data contents_B;
set contents_B;
name = upcase(name);
run;
proc sort data=contents_B;
by name;
run;
Having a list of variable names from both sources (capitalized and sorted), the next step is to find those names appear in both:
data matching;
merge contents_A (in=ina) contents_B (in=inb);
by name;
if ina and inb;
run;
Having found the matches, extract the names into a macro variable:
proc sql;
select name into : matching_vars separated by ' ' from matching;
quit;
Finally, drop the variables:
data want;
set a (drop=&matching_vars);
run;
Unfortunately, it takes a long series of steps (at least the way that I'm picturing the problem requires many steps).
It is easier to use dictionary.columns to get the common variables:
proc sql noprint; select name into :dropvars separated by ' ' from dictionary.columns where libname='SASHELP' and memname='CARS' and name in(select name from dictionary.columns where libname='SASHELP' and memname='CLASS'); ; quit; data class_without_cars; set sashelp.class(drop=&dropvars); run;
This will drop the variable WEIGHT, which is common to the two datasets.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.