I'm doing my best to figure this out. It's confusing for me and I appreciate all your help. I have a partial solution (below) that utilizes the approach provided by @PaigeMiller.
proc sql noprint;
select name into :colnames separated by ' ' /* Selects the name variable, which would typically contain column names.; */
/* It then places the selected names into a macro variable named colnames, with each name separated by a space.; */
from dictionary.columns
/* This line specifies the source of the data, which is the dictionary.columns table in SAS. */
/* This is a special table that contains metadata about all available datasets, including column names. */
where
/* Age */
substr(name,4) eq 'AGE' or
/* Education */
substr(name,4) eq 'ED' or
/* BMI */
substr(name,4) eq 'BMI' or
/* Note: Neither BMI nor height nor weight are available in the data I have. */
/* Exam dates */
/* Not including exam date 35 (see below as J02EXDAT) */
substr(name,2) eq '02DATE' or
/* Smoking status */
substr(name,2) eq '10SMOKE'
and libname='CARDIA' /* Whatever libname is needed goes here, upper case */
and memname='C1260REQ06_29_2023' /* Your data set name goes here, upper case */
;
quit;
/***********************************************************************************************************/
/* Creating a subset. */
/* &colnames (from above) is included in the KEEP statement. */
/***********************************************************************************************************/
data cardia.go;
set CARDIA.C1260REQ06_29_2023;
keep
ID
SHORT_ID
SEX
race
/* Suffix Variables */
&colnames
/* Preffix Variables */
CENTER:
/* Per Lucia: The exam at Y35 was carried out in two phases. We are still discussing the best way to assemble a Reference File for this exam. */
/* We have variable J02EXDAT. */
J02EXDAT
/* TAKING DIABETES MEDS (1=NO,2=YES) at baseline */
FDIABMED
;
label J02EXDAT="DATE OF EXAM 10";
run;
The above syntax allows me to use a list based on a suffix in a KEEP statement to create a subset (per my original post).
I would also like to be able to subset by prefix (e.g., subset all the variables from a particular timepoint; "A" = baseline). @Tom offered an approach that looks like it might offer that, but I can't figure out how to go from the PROC SQL to a dataset. When I run a PROC PRINT on the "groups" dataset the variables are all under the column "names" and the other columns are all the metadata categories:
Using what I've adapted from @PaigeMiller above (because at least I understand that part), Is there a way to create a variable that would allow me to subset by prefix?
data cardia.go_a;
set cardia.go;
where prefix = "A";
run;
Thanks for your help.
... View more