BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Danglytics
Calcite | Level 5

Hi,

I hope someone could help me with this problem:

I have a data file a.data with accounts and their attributes across ~100 variables

I have a list of variables – as a separate file – a.vars,which contains a subset of 100 variables (let’s say 20 of the 100)

I need to code a process that would drop from a.data all variables that are not in a.vars file. It has to be SAS process (not manualwork through EXCEL), that just does it in a completely automated manner.

a.data:

id var1 var2 var3 var4 var5

1 0 1 1 0 1 0

..

a.vars:

var1

var2

var3

var6

var7

new.data

id var1 var2 var3

1 0 1 1

Thanks for your help.

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

Note : Do not use the period character in SAS dataset names. That's just asking for trouble.

data a_data;
input id var1 var2 var3 var4 var5 var6 var7;
datalines;
1 0 1 1 0 1 0 1 1
;

data a_vars;
input var $;
datalines;
var1
var2
var3
var6
var7
;

proc sql noprint;
select var into :varList separated by " " from a_vars;

data want(keep=&varList); set a_data; run;

proc print; run;

PG

PG

View solution in original post

8 REPLIES 8
Reeza
Super User

Something like the following should work (untested).

You could convert it into a macro if required based on the libname, dataset name and output name easily.

proc sql;

*get columns in your dataset;

     create table vnames as

select name from dictionary.columns

where libname=your_libname and memname=yourdataset;

*Get overlap;

select name into :names separated by " " from vnames

where name in (select names from data name_list);

quit;

data new_data;

set <your_data>;

keep &names.;

run;

*Corrected as noted by ballardw Smiley Happy

ballardw
Super User

Reeza forgot to include the input dataset.

Data new_data;

     set <yourdataset>;

     keep &names;

run;

Danglytics
Calcite | Level 5

thank you for your replies.

proc sql;

*get columns in your dataset;

     create table vnames as

select name from dictionary.columns

where libname=your_libname and memname=yourdataset;

do i have to select the variable names in this statement?

Reeza
Super User

No, it grabs the names of the variables in your dataset.

You only need to specify the library and dataset name in the where clause.

Then the dataset to pull the the names you want to keep in the next query.

PGStats
Opal | Level 21

Note : Do not use the period character in SAS dataset names. That's just asking for trouble.

data a_data;
input id var1 var2 var3 var4 var5 var6 var7;
datalines;
1 0 1 1 0 1 0 1 1
;

data a_vars;
input var $;
datalines;
var1
var2
var3
var6
var7
;

proc sql noprint;
select var into :varList separated by " " from a_vars;

data want(keep=&varList); set a_data; run;

proc print; run;

PG

PG
Reeza
Super User

You'd get at least a warning message in your log with this method, as not all variables might be in the dataset.

FriedEgg
SAS Employee

data foo;

array var[10];

do id=1 to 100;

  do _n_=1 to dim(var);

   var[_n_]=rand('tabled',.5);

  end;

  output;

end;

run;

data bar;

input var $;

cards;

var1

var3

var5

var7

var9

;

run;

proc sql noprint;

select b.var into :drops separated by ','

   from bar b,

        dictionary.columns a

  where b.var=a.name;

alter table foo drop column &drops;

quit;

Astounding
PROC Star

If you are sure that every variable named in A.VARS actually appears in  A.DATA, here is the method I like.  This assumes that VARNAME is the name of the one and only variable in A.VARS:

data _null_;

   call execute('data new.data; set a.data (keep=');

   do until (done);

         set a.vars end=done;

         call execute(varname);

   end;

   call execute('); run;');

   stop;

run;

The log will show you the statements that get generated.  Also, note that ID will be dropped if it is not named in A.VARS.

Good luck.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 1350 views
  • 3 likes
  • 6 in conversation