- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I have a large data frame already formatted the way I desire with several thousand columns called df1.
col1 col2 col3 .... col5000
1 1 1 .... 1
1 0 0 .... 1
1 1 0 .... 1
... .... ... .... ....
I've selected a few hundred of these columns to use as features and have the list stored in a separate dataframe, df2.
ColN
col1
col3
...
col5000
The list is approx. 1000 entries and are not sequential. How can I select the columns from df1 so that my final result is something like
col1 col3 .... col5000
1 1 .... 1
1 0 .... 1
1 0 .... 1
.... .... .... ....
Currently I've tried something like this
proc sql;
create table
df_subset as
select
A.* in(B.colN)
from
df1 as A,
df2 as B
quit;
I am currently working in SAS studio
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
proc sql noprint;
select distinct colN into :wanted_columns separated by ' ' from df2;
quit;
data df_subset;
set df1(keep=&wanted_columns);
run;
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@rschubert1 Your code doesn't work as you try to use metadata where SQL expects data.
The method used by @PaigeMiller works.
Similarly:
proc sql noprint;
select distinct colN into :wanted_columns separated by ',' from DF2;
create table SUBSET as select &wanted_columns from DF1;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
PaigeMiller's solution does this, convert the data frame into a variable that can be used in the next steps.