DATA Step, Macro, Functions and more

Subset using proc contents output

Accepted Solution Solved
Reply
Contributor
Posts: 54
Accepted Solution

Subset using proc contents output

Hi,

 

Is it possible to subset a data set based on the proc contents output. For e.g. I want to exclude all those variables which are a part of another data set .

 

Thanks in advance.


Accepted Solutions
Solution
‎12-13-2017 06:13 AM
PROC Star
Posts: 1,400

Re: Subset using proc contents output

[ Edited ]

If you insist on using PROC CONTENTS output then do like below. 

 

Otherwise, i agree with @KurtBremser, get the information from dictionary tables or sashelp.vtable instead.

 

proc contents noprint data=sashelp.heart out=VarsToDrop(keep=name);
run;

proc sql noprint;
	select name into :vars separated by ' '
	from VarsToDrop;
quit;

%put &vars.;

data ClassExludingVars;
	set sashelp.class;
	drop &vars.;
run;

 

View solution in original post


All Replies
Super User
Posts: 10,555

Re: Subset using proc contents output

Retrieve meta-information about datasets from dictionary.tables (SQL) or sashelp.vtable (data step). This will then help you in creating macro variables for use in drop or keep statements.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Solution
‎12-13-2017 06:13 AM
PROC Star
Posts: 1,400

Re: Subset using proc contents output

[ Edited ]

If you insist on using PROC CONTENTS output then do like below. 

 

Otherwise, i agree with @KurtBremser, get the information from dictionary tables or sashelp.vtable instead.

 

proc contents noprint data=sashelp.heart out=VarsToDrop(keep=name);
run;

proc sql noprint;
	select name into :vars separated by ' '
	from VarsToDrop;
quit;

%put &vars.;

data ClassExludingVars;
	set sashelp.class;
	drop &vars.;
run;

 

Trusted Advisor
Posts: 1,389

Re: Subset using proc contents output

Let's say you want all the variables in B that are not also in A:

 

data bvars_only;
  retain _sentinel1;
  if 0 then set a;
  retain _sentinel2;
  set b;
  drop _sentinel1--_sentinel2;
run;

 

This relies on how SAS constructs the program data vector. This program orders variables in the PDV from left to right as

  1. _sentinel1
  2. all the A vars
  3. _sentinel2
  4. all the B vars not already in item2 above.

The drop statement, using a double-dash connector tells SAS to drop all the vars in item1, item2, and item3.

Super User
Posts: 6,927

Re: Subset using proc contents output

[ Edited ]

Let's say you want to drop variables in B which also appear in A.  You could start with:

 

proc contents data=A noprint out=contents_A (keep=name);

run;

proc contents data=B noprint out=contents_B (keep=name);

run;

 

Now the issues are (1) finding variable names that appear in both, despite the fact that (2) capitalization of the names can be different in both data sets.  So:

 

data contents_A;

set contents_A;

name = upcase(name);

run;

proc sort data=contents_A;

by name;

run;

data contents_B;

set contents_B;

name = upcase(name);

run;

proc sort data=contents_B;

by name;

run;

 

Having a list of variable names from both sources (capitalized and sorted), the next step is to find those names appear in both:

 

data matching;

merge contents_A (in=ina) contents_B (in=inb);

by name;

if ina and inb;

run;

 

Having found the matches, extract the names into a macro variable:

 

proc sql;

select name into : matching_vars separated by ' ' from matching;

quit;

 

Finally, drop the variables:

 

data want;

set a (drop=&matching_vars);

run;

 

Unfortunately, it takes a long series of steps (at least the way that I'm picturing the problem requires many steps).

PROC Star
Posts: 274

Re: Subset using proc contents output

[ Edited ]

It is easier to use dictionary.columns to get the common variables:

proc sql noprint;
  select name into :dropvars separated by ' '
  from dictionary.columns
  where libname='SASHELP' and memname='CARS'
   and name in(select name from dictionary.columns where libname='SASHELP' and memname='CLASS');
  ;
quit;

data class_without_cars;
  set sashelp.class(drop=&dropvars);
run;

This will drop the variable WEIGHT, which is common to the two datasets. 

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 367 views
  • 3 likes
  • 6 in conversation