I am doing a Net Reclassification Index (NRI) analysis to compare two Cox PH models. I found a SAS NRI macro from https://ncook.bwh.harvard.edu/assets/survmacs.v4.sas. I used SURV_CONTNRI of these macros. However, my SAS code did not work. The code was attached. Could you help me figure it out? Many thanks!
@SeaMoon_168 wrote:
Many thanks. Your code worked well with the Myeloma data. However, it did not work using my own data. I attached my data and simply compare base model: DM1 and extended model: DM1 DM2. Could you help me figure it out?
You're welcome. Not sure what exactly "did not work" on your side without seeing your code, log or output. But here is the code that does work with your sample data (which I read into a work dataset SAMPLE from a CSV file created from your Excel file on my private computer, as there is no Excel on my SAS workstation) with the changes highlighted in bold:
/* Create subject ID */ data one; set sample; _id_=_n_; run; /* Set (arbitrary) reference values for predictor variables */ proc summary data=one; var DM:; output out=ref(drop=_:) min=; run; /* Create the "dummy observation" as suggested in NRI.sas and set the time point of interest */ data two; _id_=0; VStatus=.; Time=2.5; set ref; run; /* Create input dataset for macro PREDMAC */ data have; set two one; run; /* Call macro PREDMAC with two different Cox models and save the predicted event probabilities */ %PREDMAC(have, _id_, VStatus, Death, Time, DM1, DM1, Base model, probt1) data mod1; set preddat; run; %PREDMAC(have, _id_, VStatus, Death, Time, DM1 DM2, DM1 DM2, Extended model, probt2) data mod2; set preddat; run; /* Create input dataset for macro SURV_CONTNRI */ data combo; merge mod1(keep=_id_ VStatus Time probt1) mod2(keep=_id_ probt2); by _id_; run; /* Estimate NRI among other statistics */ %SURV_CONTNRI(combo,2,VStatus,Time,2.5,probt1,probt2)
The main result is nri=0.30344... It depends, however, on whether variable DM1 (with values 0, 1, 2, 3) is treated as a CLASS variable by PROC PHREG (as above). If it is treated as a continuous variable, I obtain nri=0.26082... with the PREDMAC macro calls changed to
%PREDMAC(have, _id_, VStatus, Death, Time, , DM1, Base model, probt1) ... %PREDMAC(have, _id_, VStatus, Death, Time, DM2, DM1 DM2, Extended model, probt2)
As expected, treating DM2 (with values 0, 1) as a continuous variable (i.e., leaving the CLASSVARS parameter of macro PREDMAC blank also in the second call) makes no difference.
"Did not work" is awful vague.
Are there errors in the log?: Post the code and log in a code box opened with the "</>" to maintain formatting of error messages.
No output? Post any log in a code box.
Unexpected output? Provide input data in the form of data step code pasted into a code box, the actual results and the expected results. Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the "</>" icon or attached as text to show exactly what you have and that we can test code against.
Since your issue involves macro code then you would want to set:
Options MPRINT;
before executing the code. That option will send the resolved created code to the log in much more detail and have any warnings or errors appear in better relationship to the code generating them.
Set Options nomprint; to reset to default.
One thing that bothers me a bit is multiple uses of the *text; style comments. In some situations such can be treated as executable code inside a macro. To use statement style comments in a macro the line should be prefaced with an % as in %* text; Or use the /* text */ style comment.
Thank you for your advice. The log was attached. Could you help me figure out what is wrong with my SAS code?
The problem why you get no output comes from this step:
232 %SURV_CONTNRI(newdata,2,status,time,4,surv1,surv2); MPRINT(SURV_CONTNRI): title2 Continuous NRI for Event status at Time 4 : surv2 vs surv1 ; MPRINT(SURV_CONTNRI): title3 "Need to Bootstrap for SE"; MPRINT(SURV_CONTNRI): data nri1; MPRINT(SURV_CONTNRI): set newdata; MPRINT(SURV_CONTNRI): _id_=_N_; MPRINT(SURV_CONTNRI): * compute diffs in probs; MPRINT(SURV_CONTNRI): event=status; MPRINT(SURV_CONTNRI): pyrs=time; MPRINT(SURV_CONTNRI): prob1=surv1; MPRINT(SURV_CONTNRI): prob2=surv2; MPRINT(SURV_CONTNRI): diffp=prob2-prob1; MPRINT(SURV_CONTNRI): if prob1>. and prob2>.; MPRINT(SURV_CONTNRI): keep _id_ event pyrs prob1 prob2 diffp; MPRINT(SURV_CONTNRI): run; NOTE: Variable surv1 is uninitialized. NOTE: Variable surv2 is uninitialized.
When those two variables are uninitialized then the values are missing for Prob1 and Prob2 so no output records are written.
Your call to this macro is using your just created data set NEWDATA from proc import and it does not contain variables Surv1 or Surv2.
I don't see very much documentation on these macros, such as what Prob1 or Prob2 are supposed to look like but with descriptions like : PROB1 = probability for model 1 look like they are supposed to hold output from a model, possibly that proc phreg output but I can't say as there is no documentation of that. I would guess that there is a step or two that you are supposed to do that generates those model probabilities and likely combines some data sets somewhere to have the output probabilities for two separate models in one data set.
Sometimes people write these so that the output of one macro is used as the input to another but I can't tell if that is the case here or what other steps are needed to make your raw data usable for these macros.
IF your raw data is supposed to be usable with these macros then you need to determine which variables in the data set align with the macro. Perhaps DM1 is supposed to be Prob1 and DM2 is supposed to be Prob2.
Note that the Proc import step has these variables:
VAR1 $
time
status
DM1
DM2
Good luck.
As an aside, I wouldn't say your code is particularly "wrong" but that the author of the macros was more than a little obtuse about providing exactly what the input data set(s) for these should look like.
Thank you for your help. I made some changes from another expert. A new SAS macro was added named "%macro SURV_CONTNRI(DSNAME,DETAIL,EVENT,PYRS,T,PROB1,PROB2)". The SAS macro about generating surv1 and surv2 that are the probability of the Cox models. However, I still cannot run the code. It seems they create a matrix to store variables and then use Cox PH model? but I cannot fully understand all the code. Could anyone help me figure it out?
@SeaMoon_168 wrote:
Thank you for your help. I made some changes from another expert. A new SAS macro was added named "%macro SURV_CONTNRI(DSNAME,DETAIL,EVENT,PYRS,T,PROB1,PROB2)". The SAS macro about generating surv1 and surv2 that are the probability of the Cox models. However, I still cannot run the code. It seems they create a matrix to store variables and then use Cox PH model? but I cannot fully understand all the code. Could anyone help me figure it out?
If the output of that Surv_contnri is supposed to be the input for the net reclassification then you need to make sure that Surv_contnri runs properly and the log you posted shows that either your call to the macro using variables not in the data set is incorrect or the input data set is incorrect for that macro. Until you get that running properly, again assuming it is supposed to create the input for the other macro, that is your bottleneck.
@SeaMoon_168 wrote:
(...) However, I still cannot run the code.
Hello @SeaMoon_168,
I think it will help if we are using the same test data. Using the Myeloma dataset from Example 92.1 of the PROC PHREG documentation I obtained seemingly reasonable results and a clean log with the code below including calls of the two macros contained in your attachment NRI.sas:
/* Create subject ID */
data one;
set Myeloma;
_id_=_n_;
run;
/* Set (arbitrary) reference values for predictor variables */
proc summary data=one;
var LogBUN--SCalc;
output out=ref(drop=_:) min=;
run;
/* Create the "dummy observation" as suggested in NRI.sas and set the time point of interest */
data two;
_id_=0;
VStatus=.;
Time=10;
set ref;
run;
/* Create input dataset for macro PREDMAC */
data have;
set two one;
run;
/* Call macro PREDMAC with two different Cox models and save the predicted event probabilities */
%PREDMAC(have, _id_, VStatus, Death, Time, Platelet Frac, HGB Platelet Age LogWBC Frac LogPBM Protein SCalc, Base model, probt1)
data mod1;
set preddat;
run;
%PREDMAC(have, _id_, VStatus, Death, Time, Platelet Frac, LogBUN HGB Platelet Age LogWBC Frac LogPBM Protein SCalc, Extended model, probt2)
data mod2;
set preddat;
run;
/* Create input dataset for macro SURV_CONTNRI */
data combo;
merge mod1(keep=_id_ VStatus Time probt1)
mod2(keep=_id_ probt2);
by _id_;
run;
/* Estimate NRI among other statistics */
%SURV_CONTNRI(combo,2,VStatus,Time,10,probt1,probt2)
/* Result: NRI=0.43560597682792335 */
Many thanks. Your code worked well with the Myeloma data. However, it did not work using my own data. I attached my data and simply compare base model: DM1 and extended model: DM1 DM2. Could you help me figure it out?
@SeaMoon_168 wrote:
Many thanks. Your code worked well with the Myeloma data. However, it did not work using my own data. I attached my data and simply compare base model: DM1 and extended model: DM1 DM2. Could you help me figure it out?
You're welcome. Not sure what exactly "did not work" on your side without seeing your code, log or output. But here is the code that does work with your sample data (which I read into a work dataset SAMPLE from a CSV file created from your Excel file on my private computer, as there is no Excel on my SAS workstation) with the changes highlighted in bold:
/* Create subject ID */ data one; set sample; _id_=_n_; run; /* Set (arbitrary) reference values for predictor variables */ proc summary data=one; var DM:; output out=ref(drop=_:) min=; run; /* Create the "dummy observation" as suggested in NRI.sas and set the time point of interest */ data two; _id_=0; VStatus=.; Time=2.5; set ref; run; /* Create input dataset for macro PREDMAC */ data have; set two one; run; /* Call macro PREDMAC with two different Cox models and save the predicted event probabilities */ %PREDMAC(have, _id_, VStatus, Death, Time, DM1, DM1, Base model, probt1) data mod1; set preddat; run; %PREDMAC(have, _id_, VStatus, Death, Time, DM1 DM2, DM1 DM2, Extended model, probt2) data mod2; set preddat; run; /* Create input dataset for macro SURV_CONTNRI */ data combo; merge mod1(keep=_id_ VStatus Time probt1) mod2(keep=_id_ probt2); by _id_; run; /* Estimate NRI among other statistics */ %SURV_CONTNRI(combo,2,VStatus,Time,2.5,probt1,probt2)
The main result is nri=0.30344... It depends, however, on whether variable DM1 (with values 0, 1, 2, 3) is treated as a CLASS variable by PROC PHREG (as above). If it is treated as a continuous variable, I obtain nri=0.26082... with the PREDMAC macro calls changed to
%PREDMAC(have, _id_, VStatus, Death, Time, , DM1, Base model, probt1) ... %PREDMAC(have, _id_, VStatus, Death, Time, DM2, DM1 DM2, Extended model, probt2)
As expected, treating DM2 (with values 0, 1) as a continuous variable (i.e., leaving the CLASSVARS parameter of macro PREDMAC blank also in the second call) makes no difference.
Many thanks! The code worked for my data now. However, the results using the SAS code is not the same as the results using the R package-survIDINRI :-(. I have no idea if the authors of these two used the same method to calculate NRI?
@SeaMoon_168 wrote:
(...) However, the results using the SAS code is not the same as the results using the R package-survIDINRI :-(. I have no idea if the authors of these two used the same method to calculate NRI?
The documentation of the survIDINRI package contains a reference Pencina et al.(2011) and states that the "Result of continuous-NRI", denoted m2, "corresponds to the quantity defined as '1/2 NRI(>0)'" in that article. (Without delving into the R code it is not quite clear to me whether this means m2=1/2 NRI(>0) or rather m2=NRI(>0).) The definition of that NRI(>0), in turn, is consistent with the algorithm used in the SAS macros from Harvard, as far as I see, -- except that the survival probabilities in the article are "all estimated using the Kaplan-Meier approach" (Pencina et al.(2011)), whereas current SAS/STAT versions use METHOD=BRESLOW by default.
So, other than the potential trivial discrepancy due to the factor 1/2 mentioned above, you should check if adding the option METHOD=PL to the OUTPUT statements of the two PROC PHREG steps in macro SURV_CONTNRI helps to match the result from R. That is:
proc phreg data=_all_(where=(disc=1 or disc=-1)) noprint; model pyrs*event(0) = ; id _id_; strata disc; output out=survdat survival=_survest_ / method=pl; run; proc phreg data=_all_(where=(bd ne 1)) noprint; model pyrs*event(0) = ; id _id_; output out=survdat0 survival=survall / method=pl; run;
Also, are you aware that the NRI statistic is rather questionable (according to what I've read in the past few days)? Even the Wikipedia page Net reclassification improvement says, "NRI has been demonstrated to be invalid."
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.