Re: Select variables from dataframe1 using a list of column names in dataframe2

rschubert1 · Posted 11-09-2019 03:54 PM

I have a large data frame called df1, already formatted the way I want, with several thousand columns.

col1 col2 col3 .... col5000

1 1 1 .... 1

1 0 0 .... 1

1 1 0 .... 1

... .... ... .... ....

I've selected a few hundred of these columns to use as features and have the list stored in a separate dataframe, df2.

ColN

col1

col3

...

col5000

The list has approx. 1000 entries, and they are not sequential. How can I select the columns from df1 so that my final result is something like

col1 col3 .... col5000

1 1 .... 1

1 0 .... 1

.... .... .... ....

Currently I've tried something like this

proc sql;
create table 
	df_subset as
select 
	A.* in(B.colN) 
from 
	df1 as A, 
	df2 as B
quit;

I am currently working in SAS Studio.

PaigeMiller · Posted 11-09-2019 05:17 PM

proc sql noprint;
    select distinct colN into :wanted_columns separated by ' ' from df2;
quit;

data df_subset;
    set df1(keep=&wanted_columns);
run;

--
Paige Miller

ChrisNZ · Posted 11-09-2019 05:34 PM

@rschubert1 Your code doesn't work because you are trying to use metadata (column names) where SQL expects data.

The method used by @PaigeMiller works.

Similarly:

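proc sql noprint;
  /* Build a comma-separated list of the wanted column names */
  select distinct colN into :wanted_columns separated by ',' from DF2;
  /* Use the list as the SELECT clause */
  create table SUBSET as select &wanted_columns from DF1;
quit;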
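
To try this end to end, a small made-up example may help. The data below are invented stand-ins for df1 and df2 (four columns instead of several thousand), so treat it as a sketch rather than a reproduction of the original data.

```sas
/* Toy data: hypothetical stand-ins for df1 and df2 */
data df1;
    input col1 col2 col3 col4;
    datalines;
1 1 1 1
1 0 0 1
1 1 0 1
;
run;

data df2;
    length colN $32;
    input colN $;
    datalines;
col1
col3
;
run;

proc sql noprint;
    /* Comma-separated list of names, suitable for a SELECT clause */
    select distinct colN into :wanted_columns separated by ',' from df2;
    create table subset as select &wanted_columns from df1;
quit;

proc print data=subset noobs;
run;
```

With the SQL variant the columns come out in the order of the generated list, whereas a KEEP= data set option leaves them in the order they have in df1.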

High-Performance SAS Coding - Third Edition

Reeza · Posted 11-09-2019 06:41 PM

The term dataframe tells me you're coming from R or Python; SAS uses data sets. Either way, you need to convert that list to a macro variable, similar to creating it as a list in R/Python, and then apply it to your data set with a KEEP statement or KEEP= option. If you're familiar with R, KEEP is similar in functionality to the select() function in the Tidyverse.

PaigeMiller's solution does this: it converts the column names stored in the data set into a macro variable that can be used in the next steps.
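
A minimal sketch of that idea with one extra safeguard follows. It assumes df1 and df2 are in the WORK library and that colN holds valid SAS variable names. The join against dictionary.columns is an addition that is not in the posts above: it drops any name in df2 that does not exist in df1, so that the KEEP= option does not fail on an unknown variable.

```sas
/* Keep only the names in df2 that actually exist in WORK.DF1 */
proc sql noprint;
    select distinct c.name
        into :wanted_columns separated by ' '
        from dictionary.columns as c
             inner join df2 as b
             on upcase(c.name) = upcase(strip(b.colN))
        where c.libname = 'WORK' and c.memname = 'DF1';
quit;

/* Inspect how many names matched and what the list looks like */
%put NOTE: &=sqlobs &=wanted_columns;

data df_subset;
    set df1(keep=&wanted_columns);
run;
```

Checking SQLOBS in the log against the number of rows in df2 shows whether any requested names were missing from df1. If no names match at all, the macro variable is not created and the DATA step will fail, so it is worth looking at the log before relying on the result.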
