Hello, I have a dataset where my exposure of interest is the level of a particular nutrient and the outcome is cancer. I wanted to find the level of nutrient at which splitting the cohort at that point had the lowest (most significant) p-value. An example of the interpretation of this level would be: people who consume more than 100 units of this nutrient have a 20% lower risk of cancer with a p-value of 0.003.
I created a macro that splits the cohort at a specific level of the nutrient and runs PROC PHREG to give me the hazard ratios and associated p-values comparing individuals with intake above the cutpoint to individuals at or below it. I wrote a data step to run this macro at multiple nutrient levels. I am still very new to SAS and was hoping to find out whether there is a way to store the p-values and hazard ratios from the multiple PROC PHREG runs, so that I can use a data step to find the minimum p-value or even plot the p-values against the cutpoints.
Here is the macro:
%macro dichotomize(dsn,var,cutoff,i,dichotvar);
  %LET iteration = &i;

  data want&iteration;
    set want_someday;
    if &var =. then &dichotvar = .;
    else if &var le &cutoff then &dichotvar=0;
    else if &var gt &cutoff then &dichotvar=1;
  run;

  proc phreg data = want&iteration;
    title "Colorectal-cancer incidence attempt";
    class &dichotvar (ref="0") menopause_with_males (ref="0") race (ref="1") hhincome (ref="1")
          enrollment_source (ref="C") fh_colorectalcancer (ref="0")
          BMI_category_expanded (ref="0") alcohol_category (ref="0") smokestatus_packyear (ref="0") hei10_category (ref="0")
          ;
    model eof_age_CRC*Inc_CRC(0) = &dichotvar
          menopause_with_males race hhincome enrollment_source
          BMI_category_expanded totalactivitymethr smokestatus_packyear Comorbidity_Index
          ffq_kcal hei10_category alcohol_category fh_colorectalcancer
          / entry=enrollment_agemonths rl=wald;
  run;
%mend dichotomize;
Hello @fikel,
I would insert this statement before the PROC PHREG step in the macro:
ods output ParameterEstimates=estim&i(where=(parameter="&dichotvar"));
This will create datasets ESTIM1, ESTIM2, ... (for &i=1, 2, ...) containing the parameter estimates, p-values and hazard ratios. (The dataset names "ESTIM1" etc. are user-defined.) The WHERE= dataset option is optional, but may be useful to focus on the parameter whose cutoff is under investigation. You can also add a KEEP= or DROP= dataset option to restrict the datasets to variables of interest.
These ESTIMn datasets can be combined into a single dataset covering all iterations (e.g., for one parameter):
data est;
/* insert a suitable LENGTH statement here if warning "Multiple lengths ..." occurs */
set estim: indsname=dsn;
iteration=input(compress(dsn,,'kd'),32.);
run;
Instead of the shortcut estim: in the SET statement you can use a more explicit list of the form estim1-estim99 (which also might improve the sort order). You may want to add a variable CUTOFF to dataset EST (or the individual ESTIMn) in order to bring cutoff and p-values together.
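To get from the combined dataset to the minimum p-value and a plot, something along these lines might work. This is only a sketch: it assumes EST was built as above and that the p-value and hazard ratio columns carry the names HazardRatio and ProbChiSq from the ParameterEstimates table. The output dataset names EST_SORTED and EST_BY_ITER are made up for illustration.

```sas
/* Sketch only: assumes EST contains ITERATION, HazardRatio and ProbChiSq. */

/* Sort by p-value; rows with a missing p-value are dropped first,
   because missing values would otherwise sort to the top. */
proc sort data=est(where=(not missing(ProbChiSq))) out=est_sorted;
  by ProbChiSq;
run;

/* The first row after sorting holds the smallest p-value. */
proc print data=est_sorted(obs=1);
  var iteration HazardRatio ProbChiSq;
run;

/* Sort by iteration so the plotted line connects points in order. */
proc sort data=est out=est_by_iter;
  by iteration;
run;

proc sgplot data=est_by_iter;
  series x=iteration y=ProbChiSq / markers;
  yaxis label="p-value";
run;
```

Once a CUTOFF variable is available in EST, it can replace ITERATION on the x axis.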
Thank you so much! This successfully created the dataset. In regard to adding a CUTOFF variable to the EST or ESTIMn, I am having a difficult time getting this added to the ESTIMn data sets. How do I accomplish this?
There are different ways to do this. One is to insert a DATA step into the macro, after the PROC PHREG step:
data estim&i;
set estim&i;
cutoff=&cutoff;
run;
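With that DATA step in place, a small driver macro could call %dichotomize over a grid of cutoffs and then stack the results. The following is a rough sketch under several assumptions: the macro name RUN_CUTOFFS, the variable names NUTRIENT and NUTRIENT_HIGH, and the grid 50 to 150 in steps of 10 are all hypothetical, and the ODS OUTPUT statement suggested earlier is assumed to be inside %dichotomize already. Note that a macro %DO loop only accepts integer start, stop and step values.

```sas
/* Hypothetical driver: macro name, variable names and cutoff grid are assumptions. */
%macro run_cutoffs(first=50, last=150, step=10);
  %local i cutoff;
  %let i = 0;
  %do cutoff = &first %to &last %by &step;
    %let i = %eval(&i + 1);
    /* Each call creates ESTIM&i, including the CUTOFF variable */
    %dichotomize(want_someday, nutrient, &cutoff, &i, nutrient_high);
  %end;
%mend run_cutoffs;

%run_cutoffs(first=50, last=150, step=10);

/* 11 cutoffs (50, 60, ..., 150) give ESTIM1 to ESTIM11 */
data est;
  set estim1-estim11;
run;

/* p-value against cutoff */
proc sgplot data=est;
  scatter x=cutoff y=ProbChiSq;
  yaxis label="p-value";
run;
```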