BookmarkSubscribeRSS Feed
ajb
Obsidian | Level 7 ajb
Obsidian | Level 7

SAS Version 9.4

 

Hello Everyone, I have approximately 20 data sets and I would like to subset a few variables that could are present in all data sets. I am not sure if the time varying variable have the same root name  

[For example height would be at visit 1 height1 and at visit 2 height2 (or maybe v2height)]

so I would like a very flexible code that I could write that would pull and output the variables to a separate data set and their corresponding observations .

 

These are very basic examples of the data.

 

Data set 1 variables

Name

ID

EXAMDATE

HEIGHT1

WEIGHT1

BODYFAT1

ROOM

SCHOOL

CITY

MOMNAME

DADNAME

SIBLINGS

EYECOLOR

 

Data set 2 variable

NAME

ID

EXAMDATE

HEIGHT2

WEIGHT2

BODYFAT2

ROOM

SCHOOL

CITY

MOMNAME

DADNAME

SIBLINGS

HAIRCOLOR

Y2COLOR_EYES

 

I could do an simple code for each data set such as

 

data dataset1_new;

   set dataset1;

Keep Name  ID EXAMDATE HEIGHT1 WEIGHT1 BODYFAT1;

run;

 

but I wondered if there was a way to ask SAS to keep only the variable names included in a list but automate it so I do not have to change it for each data set.

 

something along the lines of

if in: (“ID”, “EXAMDATE”, “HEIGHT”, “WEIGHT”, “BODYFAT”, “EYECOLOR”, “Y2COLOR_EYES”) then OUTPUT;

 

7 REPLIES 7
DavePrinsloo
Pyrite | Level 9

Download the macro %ut_varlist described in
https://communities.sas.com/t5/SAS-Programming/Macro-function-to-return-a-list-of-variables-in-a-tab...

 

%let base_vars=

NAME ID EXAMDATE HEIGHT WEIGHT BODYFAT ROOM SCHOOL CITY MOMNAME DADNAME SIBLINGS  HAIRCOLOR COLOR_EYES;

%let keep_vars1=%ut_varlist(table=test1, contain=&base_vars.);

%let keep_vars2=%ut_varlist(table=test2, contain=&base_vars.);

etc

If it works for you please like the above posting!

Tom
Super User Tom
Super User

Are you just looking for the DKRICOND and/or DKROCOND options?

%let varlist=Name  ID EXAMDATE HEIGHT1 WEIGHT1 BODYFAT1 ;
options dkricond=nowarn;
data want1;
  set have1(keep=&varlist);
run;
data want2;
  set have2(keep=&varlist);
run;

 

DavePrinsloo
Pyrite | Level 9

Nice options, but I personally am moving in the opposite direction, such as using the NOTE2ERR option, since that forces me  to immediately avoid having columns created that are not spelled correctly, etc.  

As a consultant , I have recently been migrating some code, and found logs that are riddled with notes about missing values, etc. which could cause problems / create missing values in some cases.

 

andreas_lds
Jade | Level 19

Look at sas help.vcolumns, it contains all variables of all datasets in libraries currently assigned. If you can define a where-statement for each relevant dataset selecting the variables you want, then you are close to a solution.

proc sql noprint;
  select name into :keepList separated by " "
    from sashelp.vcolumn
    where libname = "LIBRARY" and memname = "DATASET" 
      and /* insert filter here */
  ;
quit;


Then use &varlist in a keep-statement.

EDIT: The code is now in a code-box.

RichardDeVen
Barite | Level 11

If you have a number of data sets sharing the same base variable name, meaning the names only vary slightly in suffix, you can use a name prefix list.  The list is indicated by using the colon (:) character after the variable name.

 

Keep 
  Name  ID  EXAMDATE
  HEIGHT:   /* all vars whose name starts with height */
  WEIGHT:  /* all vars whose name starts with weight */
  BODYFAT:  /* all vars whose name starts with bodyfat */
;

If you want some magic system that finds which columns among many tables are similarly  named, that is another question, or this one needs more explanation.

DavePrinsloo
Pyrite | Level 9

The colon syntax does not work because some columns like have a preceding Y2. The magic system is the %ut_varlist macro in my previous reply 🙂

The parameter CONTAIN can list any number of core values and may need to use alternatives.

To expand on my initial reply:

I modified by base_vars to simply include COLOR, as this will select the columns EYE_COLOR, COLOREYE, Y2EYE_COLOR and HAIRCOLOR, whichever exist in the specified table.  Maybe NAME would also be good, as it selects any existing columns like DADNAME, MOMNAME, MOM_NAME, DAD_NAME, DADNAME3 etc.

 

%let base_vars=NAME ID EXAMDATE EXAM_DATE HEIGHT WEIGHT BODYFAT ROOM SCHOOL CITY NAME SIBLINGS  COLOR  ;
%let keep_vars1=%ut_varlist(table=test1, contain=&base_vars.);
%let keep_vars2=%ut_varlist(table=test2, contain=&base_vars.);
etc.

 

RichardDeVen
Barite | Level 11

With all these name variants in different tables, do you ever plan to combine the tables and normalize the names through some mapping ?

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 1955 views
  • 8 likes
  • 5 in conversation