fikel
Obsidian | Level 7

Hello, I have a dataset where my exposure of interest is the level of a particular nutrient and the outcome is cancer. I want to find the nutrient level at which splitting the cohort gives the lowest (most significant) p-value. An example of the interpretation of this level would be: people who consume more than 100 units of this nutrient have a 20% lower risk of cancer, with a p-value of 0.003.

 

I created a macro that splits the cohort at a specific level of the nutrient and runs PROC PHREG to give me the hazard ratios and associated p-values comparing individuals with intake above the cutpoint to individuals at or below it. I wrote a data step to run this macro at multiple nutrient levels. I am still very new to SAS and was hoping to find out whether there is a way to store the p-values and hazard ratios from the multiple PROC PHREG runs, so that I can use a data step to find the minimum p-value or even plot the p-values against the cutpoints.

 

Here is the macro:

%macro dichotomize(dsn,var,cutoff,i,dichotvar);
%LET iteration = &i;

		/* Split the cohort at &cutoff: 0 = at or below, 1 = above, missing stays missing */
		data want&iteration;
		set &dsn; /* input dataset passed as the first macro parameter (was hard-coded as want_someday) */
		
		       if &var =. then &dichotvar = .;
		  else if &var le &cutoff then &dichotvar=0;
		  else if &var gt &cutoff then &dichotvar=1;
		run;
		
		/* Cox model comparing intake above vs. at or below the cutoff, with left truncation at enrollment age */
		proc phreg data = want&iteration;
		title "Colorectal-cancer incidence attempt";
		
		class &dichotvar (ref="0") menopause_with_males (ref="0") race (ref="1") hhincome (ref="1")
				enrollment_source (ref="C") fh_colorectalcancer (ref="0")
				BMI_category_expanded (ref="0") alcohol_category (ref="0") smokestatus_packyear (ref="0") hei10_category (ref="0")
				; 
		model eof_age_CRC*Inc_CRC(0) = &dichotvar 
				menopause_with_males race hhincome enrollment_source 
				BMI_category_expanded totalactivitymethr smokestatus_packyear Comorbidity_Index 
				ffq_kcal hei10_category alcohol_category fh_colorectalcancer
				/ entry=enrollment_agemonths rl=wald;
		run;	
		
%mend dichotomize;

 

1 ACCEPTED SOLUTION

FreelanceReinh
Jade | Level 19

Hello @fikel,

 

I would insert this statement before the PROC PHREG step in the macro:

ods output ParameterEstimates=estim&i(where=(parameter="&dichotvar"));

This will create datasets ESTIM1, ESTIM2, ... (for &i=1, 2, ...) containing the parameter estimates, p-values and hazard ratios. (The dataset names "ESTIM1" etc. are user-defined.) The WHERE= dataset option is optional, but may be useful to focus on the parameter whose cutoff is under investigation. You can also add a KEEP= or DROP= dataset option to restrict the datasets to variables of interest.
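
As a minimal sketch of the KEEP= idea, the statement could be extended as shown here. The column names (Parameter, ProbChiSq, HazardRatio, HRLowerCL, HRUpperCL) are assumptions about what PROC PHREG writes to the ParameterEstimates table when RL=WALD is requested; it is worth confirming them with PROC CONTENTS on one ESTIMn dataset before relying on them.

```sas
/* Sketch only: column names are assumed, verify with                  */
/*   proc contents data=estim1; run;                                   */
/* Parameter is kept because the WHERE= condition refers to it.        */
ods output ParameterEstimates=estim&i(
    where=(parameter="&dichotvar")
    keep=Parameter ProbChiSq HazardRatio HRLowerCL HRUpperCL);
```

Like the original statement, this belongs inside the macro, directly before the PROC PHREG step.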

 

These ESTIMn datasets can be combined into a single dataset for all iterations (e.g., for one parameter):

data est;
/* insert a suitable LENGTH statement here if warning "Multiple lengths ..." occurs */
set estim: indsname=dsn;
iteration=input(compress(dsn,,'kd'),32.);
run;

Instead of the shortcut estim: in the SET statement you can use a more explicit list of the form estim1-estim99 (which also might improve the sort order). You may want to add a variable CUTOFF to dataset EST (or the individual ESTIMn) in order to bring cutoff and p-values together.
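
A possible version of the explicit list is sketched here. The macro variable n is hypothetical and stands for the number of iterations actually run; a numbered range list like this expects every dataset in the range to exist.

```sas
/* Sketch: n is a hypothetical macro variable, not part of the original code. */
%let n = 20;   /* assumed value; set to the actual number of iterations */

data est;
    /* explicit numbered list reads ESTIM1, ESTIM2, ... in numeric order */
    set estim1-estim&n indsname=dsn;
    /* dsn holds e.g. WORK.ESTIM12; keep only the digits to recover 12 */
    iteration = input(compress(dsn, , 'kd'), 32.);
run;
```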

fikel
Obsidian | Level 7

Thank you so much! This successfully created the dataset. In regard to adding a CUTOFF variable to the EST or ESTIMn, I am having a difficult time getting this added to the ESTIMn data sets. How do I accomplish this?

FreelanceReinh
Jade | Level 19

There are different ways to do this. One is to insert a DATA step into the macro, after the PROC PHREG step:

  data estim&i;
  set estim&i;
  cutoff=&cutoff;
  run;
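
With CUTOFF in place, the original goal (finding the minimum p-value and plotting p-values by cutpoint) could be approached roughly as follows. This is only a sketch: it assumes the ESTIMn datasets have been combined into EST as described in the accepted solution, that EST has one row per cutoff, and that the p-value column is named ProbChiSq. The dataset names EST_SORTED and BEST_CUTOFF are made up for illustration.

```sas
/* Assumes EST has one row per cutoff with variables CUTOFF and ProbChiSq. */
proc sort data=est out=est_sorted;
    by ProbChiSq;                 /* smallest p-value first */
run;

/* First observation = cutoff with the smallest p-value */
data best_cutoff;
    set est_sorted(obs=1);
run;

proc print data=best_cutoff;
run;

/* p-value against cutoff; SCATTER does not depend on the row order of EST */
proc sgplot data=est;
    scatter x=cutoff y=ProbChiSq;
    xaxis label="Nutrient cutoff";
    yaxis label="p-value";
run;
```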

 
