I have a dataset with thousands of columns and millions of records. I need all the rows but not all the columns, and I want to subset the dataset by keeping only the columns that have specific labels. So the subsetting is not based on column or variable names; it's based on their labels. Any idea how to do this efficiently?
Thanks
Do the labels of the variables you want to keep (or drop) have any unique characteristic, such as a word or phrase that does not occur in the labels of the other variables?
Or do you have an existing list of the labels you want to keep?
Yes. I'm looking for labels that contain a specific term. But the variables with such labels are scattered across the whole dataset.
Create a macro variable that holds the variable names by querying the dictionary tables and searching the label column for your term.
proc sql;
  select name into :vlist separated by " "
  from dictionary.tables
  where libname = "SASHELP" and memname = "CLASS"
    and upcase(label) contains "YOUR CRITERIA";
quit;
data want;
  set have (keep=&vlist);
run;
Sorry, but I'm a newbie to SAS and not very familiar with dictionary.tables. When I ran your code, I got this error:
ERROR: The following columns were not found in the contributing tables: name.
By the way, I'm running SAS in command line (no gui).
Try googling "SAS dictionary tables" and you'll find a lot of references to them.
SAS(R) 9.2 Language Reference: Concepts, Second Edition
Sorry, I pointed you to the wrong table. It is the sashelp.vcolumn or dictionary.columns table instead.
You can run a proc contents on this table to see the column names and even a proc print to see some of the data.
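As a rough sketch of that inspection step, the code below lists the columns of the sashelp.vcolumn view and then prints the name and label stored for each variable of one data set. SASHELP.CLASS is used only because it appears elsewhere in this thread; substitute your own library and data set name, both in uppercase.

```sas
/* List the columns available in the SASHELP.VCOLUMN view. */
proc contents data=sashelp.vcolumn;
run;

/* Print the variable names and labels recorded for one data set.  */
/* SASHELP and CLASS are placeholders: use your own library and    */
/* member name, written in uppercase as they are stored that way.  */
proc print data=sashelp.vcolumn;
  where libname = "SASHELP" and memname = "CLASS";
  var name label;
run;
```

Looking at the label column this way makes it easier to decide which search term to use in the query.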
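Try this:
proc sql;
  /* Collect the names of all variables whose label contains the search term. */
  /* LIBNAME and MEMNAME are stored in uppercase in dictionary.columns.       */
  /* Replace SASHELP and CLASS with the library and member name of HAVE.      */
  select name into :vlist separated by " "
  from dictionary.columns
  where libname = "SASHELP" and memname = "CLASS"
    and upcase(label) contains "YOUR CRITERIA";
quit;
data want;
  /* Keep only the variables whose names were collected above. */
  set have (keep=&vlist);
run;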
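Before running the data step, it may be worth checking what the query actually found, since an empty list would make the keep= option fail. The sketch below is only an illustration: WORK.HAVE and the search text TERM are hypothetical placeholders, not names from the original question.

```sas
/* Hypothetical names: WORK.HAVE is the input data set, TERM is the search text. */
%let vlist=;   /* start empty so a value left over from an earlier run is not reused */

proc sql noprint;
  select name into :vlist separated by " "
  from dictionary.columns
  where libname = "WORK" and memname = "HAVE"
    and upcase(label) contains "TERM";
quit;

/* SQLOBS is set by PROC SQL to the number of rows the query returned. */
%put NOTE: &sqlobs variable(s) matched: &vlist;
```

The %put statement writes the match count and the list to the log, which is convenient when running SAS from the command line. With thousands of matching columns, keep in mind that a macro variable value is limited to 65,534 characters.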