fikel
Obsidian | Level 7

Hello, I have a dataset where the exposure of interest is the level of a particular nutrient and the outcome is cancer. I want to find the nutrient level at which splitting the cohort yields the lowest (most significant) p-value. An example interpretation of this level would be: people who consume more than 100 units of this nutrient have a 20% lower risk of cancer, with a p-value of 0.003.

 

I created a macro that splits the cohort at a specific level of the nutrient and runs PROC PHREG to give me the hazard ratios and associated p-values comparing individuals with intake above the cutpoint to those below it. I wrote a data step to run this macro at multiple nutrient levels. I am still very new to SAS and was hoping to find out whether there is a way to store the p-values and hazard ratios from the multiple PROC PHREG runs, so I can use a data step to find the minimum p-value or even plot the p-values against the cutpoints.

 

Here is the macro:

%macro dichotomize(dsn,var,cutoff,i,dichotvar);
%LET iteration = &i;

    /* Split the cohort at &cutoff: 0 = at or below, 1 = above; missing stays missing */
    data want&iteration;
    set &dsn;   /* input dataset passed in via the dsn parameter */

           if &var =. then &dichotvar = .;
      else if &var le &cutoff then &dichotvar=0;
      else if &var gt &cutoff then &dichotvar=1;
    run;

    /* Cox model comparing intake above vs. below the cutpoint, adjusted for covariates */
    proc phreg data = want&iteration;
    title "Colorectal-cancer incidence attempt";

    class &dichotvar (ref="0") menopause_with_males (ref="0") race (ref="1") hhincome (ref="1")
            enrollment_source (ref="C") fh_colorectalcancer (ref="0")
            BMI_category_expanded (ref="0") alcohol_category (ref="0") smokestatus_packyear (ref="0") hei10_category (ref="0")
            ;
    model eof_age_CRC*Inc_CRC(0) = &dichotvar
            menopause_with_males race hhincome enrollment_source
            BMI_category_expanded totalactivitymethr smokestatus_packyear Comorbidity_Index
            ffq_kcal hei10_category alcohol_category fh_colorectalcancer
            / entry=enrollment_agemonths rl=wald;
    run;

%mend dichotomize;

 

1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

Hello @fikel,

 

I would insert this statement before the PROC PHREG step in the macro:

ods output ParameterEstimates=estim&i(where=(parameter="&dichotvar"));

This will create datasets ESTIM1, ESTIM2, ... (for &i=1, 2, ...) containing the parameter estimates, p-values and hazard ratios. (The dataset names "ESTIM1" etc. are user-defined.) The WHERE= dataset option is optional, but may be useful to focus on the parameter whose cutoff is under investigation. You can also add a KEEP= or DROP= dataset option to restrict the datasets to variables of interest.
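
As a rough illustration of where the statement sits and what a KEEP= option might look like, here is a minimal sketch of the relevant part of the macro. The variable names in KEEP= (Parameter, ClassVal0, ProbChiSq, HazardRatio, HRLowerCL, HRUpperCL) are the ones I would expect in the ParameterEstimates table when a CLASS statement and RL=WALD are used, but please treat them as an assumption and check one ESTIMn dataset with PROC CONTENTS first. The model is shortened here as a placeholder for the full one.

```sas
/* Fragment for use inside %dichotomize, immediately before PROC PHREG.  */
/* The KEEP= variable names are assumed; verify them with PROC CONTENTS. */
/* Parameter is kept because the WHERE= condition refers to it.          */
ods output ParameterEstimates=estim&i(
    keep=Parameter ClassVal0 ProbChiSq HazardRatio HRLowerCL HRUpperCL
    where=(Parameter="&dichotvar"));

proc phreg data=want&iteration;
  class &dichotvar (ref="0");
  /* other covariates omitted for brevity; use the full model from the macro */
  model eof_age_CRC*Inc_CRC(0) = &dichotvar
        / entry=enrollment_agemonths rl=wald;
run;
```

The ODS OUTPUT statement only applies to the next procedure step that produces the named table, so it needs to be issued again on every macro call, which happens automatically when it is part of the macro.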

 

These ESTIMn datasets can be combined into a single dataset covering all iterations (e.g., for one parameter):

data est;
/* insert a suitable LENGTH statement here if warning "Multiple lengths ..." occurs */
set estim: indsname=dsn;
iteration=input(compress(dsn,,'kd'),32.);
run;

Instead of the shortcut estim: in the SET statement you can use a more explicit list of the form estim1-estim99 (which also might improve the sort order). You may want to add a variable CUTOFF to dataset EST (or the individual ESTIMn) in order to bring cutoff and p-values together.
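
To make the remarks about the LENGTH statement and the sort order concrete, here is a hedged sketch. The length $32 for Parameter is only a guess (use whatever the longest ESTIMn needs), and the sort step is one possible way to restore numeric order when the estim: shortcut reads the datasets alphabetically (estim1, estim10, estim11, estim2, ...).

```sas
data est;
  length Parameter $32;        /* assumed length; adjust if the warning persists */
  set estim: indsname=dsn;     /* dsn holds e.g. WORK.ESTIM12                    */
  /* keep only the digits of the dataset name to recover the iteration number    */
  iteration = input(compress(dsn, , 'kd'), 32.);
run;

/* put the rows in numeric iteration order */
proc sort data=est;
  by iteration;
run;
```

Note that the digit extraction assumes the libref itself contains no digits (true for WORK); otherwise those digits would end up in ITERATION as well.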

fikel
Obsidian | Level 7

Thank you so much! This successfully created the dataset. In regard to adding a CUTOFF variable to the EST or ESTIMn, I am having a difficult time getting this added to the ESTIMn data sets. How do I accomplish this?

FreelanceReinh
Jade | Level 19

There are different ways to do this. One is to insert a DATA step into the macro, after the PROC PHREG step:

  data estim&i;
  set estim&i;
  cutoff=&cutoff;
  run;
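
Once every ESTIMn carries its CUTOFF and the datasets have been combined into EST, the minimum p-value and a plot of p-values by cutpoint could be obtained along these lines. This is only a sketch: it assumes EST has one row per cutoff and that the p-value and hazard ratio variables are named ProbChiSq and HazardRatio, as in the usual PROC PHREG ParameterEstimates table.

```sas
/* Row(s) with the smallest p-value; missing p-values are ignored by MIN() */
proc sql;
  select cutoff, HazardRatio, ProbChiSq
  from est
  having ProbChiSq = min(ProbChiSq);
quit;

/* Sort by cutoff so the line in the plot runs left to right */
proc sort data=est out=est_sorted;
  by cutoff;
run;

proc sgplot data=est_sorted;
  series x=cutoff y=ProbChiSq / markers;
  xaxis label="Cutoff";
  yaxis label="P-value";
run;
```

The PROC SQL step will write a note about remerging summary statistics to the log; that is expected for this kind of HAVING clause.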

 
