SAS Version 9.4
Hello Everyone, I have approximately 20 data sets and I would like to subset a few variables that could are present in all data sets. I am not sure if the time varying variable have the same root name
[For example height would be at visit 1 height1 and at visit 2 height2 (or maybe v2height)]
so I would like a very flexible code that I could write that would pull and output the variables to a separate data set and their corresponding observations .
These are very basic examples of the data.
Data set 1 variables
Name
ID
EXAMDATE
HEIGHT1
WEIGHT1
BODYFAT1
ROOM
SCHOOL
CITY
MOMNAME
DADNAME
SIBLINGS
EYECOLOR
Data set 2 variable
NAME
ID
EXAMDATE
HEIGHT2
WEIGHT2
BODYFAT2
ROOM
SCHOOL
CITY
MOMNAME
DADNAME
SIBLINGS
HAIRCOLOR
Y2COLOR_EYES
I could do an simple code for each data set such as
data dataset1_new;
set dataset1;
Keep Name ID EXAMDATE HEIGHT1 WEIGHT1 BODYFAT1;
run;
but I wondered if there was a way to ask SAS to keep only the variable names included in a list but automate it so I do not have to change it for each data set.
something along the lines of
if in: (“ID”, “EXAMDATE”, “HEIGHT”, “WEIGHT”, “BODYFAT”, “EYECOLOR”, “Y2COLOR_EYES”) then OUTPUT;
Download the macro %ut_varlist described in
https://communities.sas.com/t5/SAS-Programming/Macro-function-to-return-a-list-of-variables-in-a-tab...
%let base_vars=
NAME ID EXAMDATE HEIGHT WEIGHT BODYFAT ROOM SCHOOL CITY MOMNAME DADNAME SIBLINGS HAIRCOLOR COLOR_EYES;
%let keep_vars1=%ut_varlist(table=test1, contain=&base_vars.);
%let keep_vars2=%ut_varlist(table=test2, contain=&base_vars.);
etc
If it works for you please like the above posting!
Are you just looking for the DKRICOND and/or DKROCOND options?
%let varlist=Name ID EXAMDATE HEIGHT1 WEIGHT1 BODYFAT1 ;
options dkricond=nowarn;
data want1;
set have1(keep=&varlist);
run;
data want2;
set have2(keep=&varlist);
run;
Nice options, but I personally am moving in the opposite direction, such as using the NOTE2ERR option, since that forces me to immediately avoid having columns created that are not spelled correctly, etc.
As a consultant , I have recently been migrating some code, and found logs that are riddled with notes about missing values, etc. which could cause problems / create missing values in some cases.
Look at sas help.vcolumns, it contains all variables of all datasets in libraries currently assigned. If you can define a where-statement for each relevant dataset selecting the variables you want, then you are close to a solution.
proc sql noprint;
select name into :keepList separated by " "
from sashelp.vcolumn
where libname = "LIBRARY" and memname = "DATASET"
and /* insert filter here */
;
quit;
Then use &varlist in a keep-statement.
EDIT: The code is now in a code-box.
If you have a number of data sets sharing the same base variable name, meaning the names only vary slightly in suffix, you can use a name prefix list. The list is indicated by using the colon (:) character after the variable name.
Keep
Name ID EXAMDATE
HEIGHT: /* all vars whose name starts with height */
WEIGHT: /* all vars whose name starts with weight */
BODYFAT: /* all vars whose name starts with bodyfat */
;
If you want some magic system that finds which columns among many tables are similarly named, that is another question, or this one needs more explanation.
The colon syntax does not work because some columns like have a preceding Y2. The magic system is the %ut_varlist macro in my previous reply 🙂
The parameter CONTAIN can list any number of core values and may need to use alternatives.
To expand on my initial reply:
I modified by base_vars to simply include COLOR, as this will select the columns EYE_COLOR, COLOREYE, Y2EYE_COLOR and HAIRCOLOR, whichever exist in the specified table. Maybe NAME would also be good, as it selects any existing columns like DADNAME, MOMNAME, MOM_NAME, DAD_NAME, DADNAME3 etc.
%let base_vars=NAME ID EXAMDATE EXAM_DATE HEIGHT WEIGHT BODYFAT ROOM SCHOOL CITY NAME SIBLINGS COLOR ;
%let keep_vars1=%ut_varlist(table=test1, contain=&base_vars.);
%let keep_vars2=%ut_varlist(table=test2, contain=&base_vars.);
etc.
With all these name variants in different tables, do you ever plan to combine the tables and normalize the names through some mapping ?
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.