fikel
Obsidian | Level 7

Hello, I have a dataset in which the exposure of interest is the level of a particular nutrient and the outcome is cancer. I want to find the nutrient level at which splitting the cohort gives the lowest (most significant) p-value. An example interpretation of this level would be: people who consume more than 100 units of this nutrient have a 20% lower risk of cancer, with a p-value of 0.003.

 

I created a macro that splits the cohort at a specific nutrient level and runs PROC PHREG to give me the hazard ratios and associated p-values comparing individuals with intake above the cutpoint to those below it. I wrote a data step to run this macro at multiple nutrient levels. I am still very new to SAS and was hoping to find out whether there is a way to store the p-values and hazard ratios from the multiple PROC PHREG runs, so that I can use a data step to find the minimum p-value or even plot the p-values against the cutpoints.

 

Here is the macro:

%macro dichotomize(dsn,var,cutoff,i,dichotvar);
%LET iteration = &i;

		data want&iteration;
		set &dsn; /* read the dataset passed in via the DSN parameter */
		
		       if &var =. then &dichotvar = .;
		  else if &var le &cutoff then &dichotvar=0;
		  else if &var gt &cutoff then &dichotvar=1;
		run;
		
		proc phreg data = want&iteration;
		title "Colorectal-cancer incidence attempt";
		
		class &dichotvar (ref="0") menopause_with_males (ref="0") race (ref="1") hhincome (ref="1")
				enrollment_source (ref="C") fh_colorectalcancer (ref="0")
				BMI_category_expanded (ref="0") alcohol_category (ref="0") smokestatus_packyear (ref="0") hei10_category (ref="0")
				; 
		model eof_age_CRC*Inc_CRC(0) = &dichotvar 
				menopause_with_males race hhincome enrollment_source 
				BMI_category_expanded totalactivitymethr smokestatus_packyear Comorbidity_Index 
				ffq_kcal hei10_category alcohol_category fh_colorectalcancer
				/ entry=enrollment_agemonths rl=wald;
		run;	
		
%mend dichotomize;

 

1 ACCEPTED SOLUTION
FreelanceReinh
Jade | Level 19

Hello @fikel,

 

I would insert this statement before the PROC PHREG step in the macro:

ods output ParameterEstimates=estim&i(where=(parameter="&dichotvar"));

This will create datasets ESTIM1, ESTIM2, ... (for &i=1, 2, ...) containing the parameter estimates, p-values and hazard ratios. (The dataset names "ESTIM1" etc. are user-defined.) The WHERE= dataset option is optional, but may be useful to focus on the parameter whose cutoff is under investigation. You can also add a KEEP= or DROP= dataset option to restrict the datasets to variables of interest.
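As a rough sketch of the KEEP= variant mentioned above, the statement could look like the following. The variable names Parameter, ProbChiSq and HazardRatio are assumed here to be the ones PROC PHREG writes to its ParameterEstimates table; it is worth checking them with PROC CONTENTS on one ESTIMn dataset first.

```sas
/* Placed inside %dichotomize, immediately before the PROC PHREG step.      */
/* Assumption: Parameter, ProbChiSq and HazardRatio exist in the table;     */
/* verify with PROC CONTENTS before relying on this KEEP= list.             */
ods output ParameterEstimates=estim&i(
    where=(parameter="&dichotvar")
    keep=Parameter ProbChiSq HazardRatio
);
```

Parameter stays in the KEEP= list because the WHERE= condition refers to it.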

 

These ESTIMn datasets can be combined into a single dataset covering all iterations (e.g., for one parameter):

data est;
/* insert a suitable LENGTH statement here if warning "Multiple lengths ..." occurs */
set estim: indsname=dsn;
iteration=input(compress(dsn,,'kd'),32.);
run;

Instead of the shortcut estim: in the SET statement, you can use a more explicit list of the form estim1-estim99, which might also improve the sort order of the combined dataset. You may want to add a variable CUTOFF to dataset EST (or to the individual ESTIMn datasets) in order to bring the cutoffs and p-values together.
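A minimal sketch of the explicit-list form, with the LENGTH statement filled in, might look like this. The upper bound of 20 and the length of $32 for Parameter are purely illustrative assumptions, and Parameter is only a guess at which variable triggers the "Multiple lengths" warning; adjust both to your own run.

```sas
data est;
  /* Hypothetical length; set it to the longest value across ESTIMn */
  length Parameter $32;
  /* Numbered range list: every dataset ESTIM1 ... ESTIM20 must exist */
  set estim1-estim20 indsname=dsn;
  /* Keep only the digits of the source dataset name, e.g. WORK.ESTIM7 -> 7 */
  iteration = input(compress(dsn,,'kd'), 32.);
run;
```

With the numbered range, the datasets are read in numeric order, so the rows of EST follow the iteration number.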

fikel
Obsidian | Level 7

Thank you so much! This successfully created the dataset. In regard to adding a CUTOFF variable to the EST or ESTIMn, I am having a difficult time getting this added to the ESTIMn data sets. How do I accomplish this?

FreelanceReinh
Jade | Level 19

There are different ways to do this. One is to insert a DATA step into the macro, after the PROC PHREG step:

  data estim&i;
  set estim&i;
  cutoff=&cutoff;
  run;
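Once each ESTIMn carries its CUTOFF, the original goal (finding the minimum p-value and plotting p-values by cutpoint) could be approached roughly as follows. This is only a sketch: it assumes the p-value and hazard ratio variables are named ProbChiSq and HazardRatio, and the dataset names EST_BY_CUTOFF, EST_BY_P and BEST_CUTOFF are made up for illustration.

```sas
/* Rebuild EST so that it picks up the CUTOFF variable added to each ESTIMn */
data est;
  set estim: indsname=dsn;
  iteration = input(compress(dsn,,'kd'), 32.);
run;

/* Order rows by cutoff so the plotted line runs from left to right */
proc sort data=est out=est_by_cutoff;
  by cutoff;
run;

/* Smallest non-missing p-value: sort ascending and keep the first row */
proc sort data=est(where=(not missing(ProbChiSq))) out=est_by_p;
  by ProbChiSq;
run;

data best_cutoff;
  set est_by_p(obs=1);
run;

proc print data=best_cutoff noobs;
  var cutoff HazardRatio ProbChiSq;
run;

/* p-value as a function of the cutoff */
proc sgplot data=est_by_cutoff;
  series x=cutoff y=ProbChiSq / markers;
  xaxis label="Cutoff";
  yaxis label="p-value";
run;
```

Missing p-values are filtered out before sorting because SAS sorts missing numeric values ahead of all others, which would otherwise put them in the first row.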

 
