Kaplan-Meier Survival Plotting Macro %NEWSURV
Hello everyone,
Upon learning that sascommunity.org is being decommissioned, I decided to move my wiki page to this website as an article so that I can continue to share the code. This macro has been around for years and has constantly evolved into my pocketknife for survival analysis, whether for graphing or for generating tables. I have presented it at four separate conferences and freely share it with anyone who is interested.
Abstract
The research areas of pharmaceuticals and oncology clinical trials greatly depend on time-to-event endpoints such as overall survival and progression-free survival. One of the best graphical displays of these analyses is the Kaplan-Meier curve, which can be simple to generate with the LIFETEST procedure but difficult to customize. Journal articles generally prefer that statistics such as median time-to-event, number of patients, and time-point event-free rate estimates be displayed within the graphic itself, and this was previously difficult to do without an external program such as Microsoft Excel. The macro NEWSURV takes advantage of the Graph Template Language (GTL) that was added with the SG graphics engine to create this level of customizability without the need for backend manipulation. Taking this one step further, the macro was improved to be able to generate a lattice of multiple unique Kaplan-Meier curves for side by side comparisons or condensing figures for publications. The following is a paper describing the functionality of the macro and a description of how the key elements of the macro work.
What does the macro do?
Generates a graph
The original purpose of creating the macro was to make a journal quality Kaplan-Meier (KM) curve that included common survival statistics within the curve itself so that I did not have to manually add them to an image post-hoc. Below is an example using the dataset SASHELP.BMT:
The macro allows for a vast amount of optional customization to fit the user's needs, but also allows for simple macro calls to get the ball rolling.
Generates a table
I added the ability for the macro to take the survival statistics it was calculating and organize them into a clean summary table using the REPORT procedure.
Automatically calculates survival statistics
The macro automatically computes many commonly used survival statistics including the following:
- Kaplan-Meier
  - Number of patients/events (PROC LIFETEST)
  - Median Time-to-Event w/95% Confidence Bounds (PROC LIFETEST)
  - Time-Point Event-Free Rates w/95% Confidence Bounds (PROC LIFETEST)
  - Unstratified/Stratified Logrank/Wilcoxon P-value (PROC LIFETEST)
- Cumulative Incidence Analyses
  - The methods used to compute these differ depending on SAS version:
    - Prior to SAS 9.4M3: The code used to compute cumulative incidence was taken from the SAS autocall macro %CIF. That macro uses PROC IML to do most of the calculations, but since IML is an optional add-on that some companies do not have, DATA step array processing was used as a substitute. The p-value is calculated with the matrix operations within PROC FCMP.
    - SAS 9.4M3+: PROC LIFETEST
  - Number of patients/events
  - Median Time-to-Event w/95% Confidence Bounds
  - Time-Point Event-Free Rates w/95% Confidence Bounds
  - Unstratified/Stratified Gray K-sample P-value
- Patients-at-Risk Counts
- Cox Proportional Hazards Regression
  - Hazard Ratios w/95% Confidence Bounds (PROC PHREG)
  - Wald Parameter P-values (PROC PHREG)
  - Wald/Score/Likelihood-Ratio Type 3 Test P-values (PROC PHREG)
  - Competing Risks (PROC PHREG - SAS 9.4+ Only)
- Concordance index
  - This is calculated in DATA steps following the code of the survConcordance function from the R survival package
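To give a rough idea of where the Kaplan-Meier statistics in this list come from, here is a minimal sketch that calls PROC LIFETEST directly on SASHELP.BMT (variables T, Status, and Group). It only illustrates the underlying procedure call and is not the macro's internal code; the time-points 365 and 730 days are arbitrary choices for the illustration.

```sas
/* Kaplan-Meier estimates by group on SASHELP.BMT.                     */
/* The output includes the number of patients/events, median          */
/* time-to-event with 95% limits, and the estimates at the TIMELIST   */
/* time-points (365 and 730 days are arbitrary illustration values).  */
proc lifetest data=sashelp.bmt method=km timelist=365 730 plots=none;
   time T*Status(0);                         /* 0 = censored          */
   strata Group / test=(logrank wilcoxon);   /* unstratified tests    */
run;
```

The macro collects these same kinds of results and places them inside the graph or the summary table, so none of this has to be assembled by hand.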
There are parameters for customizing the automated analysis, including:
- Stratification
- Categorical and continuous adjusting factors for hazard ratios
- Class variable options
- Reference value
- Order of values
- Cumulative incidence variance calculations
- Ability to subset with a WHERE clause without making a new data set
- Ability to indicate censor or event value
- Method for ties within Cox models
More plot annotations available
- Confidence bounds as lines, color bands, or both
- Reference lines that drop to the x-axis and/or y-axis
- Can show either the median time-to-events or event-free rates
Why use this macro?
There are multitudes of survival based macros out there in the wild. Why should you choose to try this one instead of one of those or an internal macro?
The macro's middle name is customizability
Nearly all parts of the graph are customizable with multiple options.
Step Plot Curves and Confidence Bounds
- Thickness, color, and pattern
- Thickness, color, pattern, fill color, and transparency of band plot
Censor Indicating Scatter Plot
- Size and color
Patients-at-Risk Block Plot
- Size and weight of font
- Location of block plot (Above x-axis or below x-axis)
- Location of block plot labels (Left of block plot, above block plot, none)
- Location and text of header
- Match color to step plot curves
Display Statistics
- Text size and weight
- Statistics column header text
- Which statistics are displayed and their order
- Number of patients and number of events
- Number of patients and number of events separately ###
- Number of events/number of patients ###/###
- Median time-to-event w/95% confidence bounds ###.# (###.#-###.#)
- Hazard ratios w/95% confidence bounds ###.# (###.#-###.#)
- Time-point event-free rates
- Column for time-points (e.g. 60 days, 4 months, etc) which can be disabled
- Column for event-free rates w/95% confidence bounds ###.# (###.#-###.#)
- P-values
- P-value format (#.####), values less than 0.0001 shown as <0.0001
- Column for covariate level Wald p-values
- Section for type 3 p-value of the class variable
- Concordance index w/95% confidence bounds #.## (#.##-#.##)
- Manually entered comments are available with the TABLECOMMENTS parameter
Titles/Footnotes
- Text size and weight
- Overall titles/footnotes for the image as well as individual titles/footnotes for each plot
- Superscripts/subscripts/Unicode available
Axes
- Y-axis can be set to proportions or percentages
- X-axis can be transformed into other time units (days to months, etc.)
- Automatic or manual labels
- Automatic labels will be "Percent/Proportion with Event" for Y-axis and the time variable's label for the X-axis
- Manual labels can be entered to override automatic labels
- Label and tick value text size and weight
- Top and right frames can be disabled
- Axis colors
Image options
- Can output any type of image
- Includes special programming for TIFF files to be smaller file size
- Can output or be included within nearly any ODS tags (RTF, EXCEL, PDF, HTML, POWERPOINT)
- Height, width, dpi, anti aliasing
- Axis colors and font colors
- Background colors and background transparency
The macro runs clean
- The NEWSURV macro is designed to run completely internally and will not change your options, does not have any global macro variables, and cleans up any temporary data sets that it makes
- Includes thorough error checking that not only is designed to stop the macro before it crashes, but gives the user tips on what went wrong
- For example, if a macro option is used incorrectly (e.g. a variable doesn't exist), the macro's error text says so
- If a parameter with multiple options is used incorrectly (invalid value), the macro will tell the user this and list the valid values
- The results, log, and notes are turned off while the macro is running to keep the log, listing and results windows clean
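As an illustration of what "running clean" can look like, here is a generic sketch of the save-and-restore pattern. It is not taken from the NEWSURV source; the macro name QUIET_DEMO and the data set _QUIET_TMP are made up for this example.

```sas
%macro quiet_demo;
   /* %local keeps the helper variable out of the global symbol table */
   %local _notes;

   /* Remember the caller's setting (GETOPTION returns NOTES or NONOTES) */
   %let _notes = %sysfunc(getoption(notes));
   options nonotes;

   /* Do the work in a temporary data set ... */
   data _quiet_tmp;
      set sashelp.class;
   run;

   /* ... and delete it before the macro ends */
   proc datasets library=work nolist;
      delete _quiet_tmp;
   quit;

   /* Put the caller's original setting back */
   options &_notes;
%mend quiet_demo;

%quiet_demo
```

After the call, the NOTES option is whatever it was before, no new global macro variables exist, and WORK contains no leftover data sets.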
The macro runs multiple models and can create multiple graphs in one image
- Multiple statistical models can be run with one macro call for the statistical table
- Multiple graphs can be created within one image
- Each graph can have different options including time variables, class variables, and methods
- Easy to create a paneled image for manuscripts
The macro is mostly backwards compatible back to SAS 9.2
The macro was written in SAS 9.2 and most of the options and techniques are still compatible with the older versions (9.2/9.3) of SAS. These features include creating multiple graphs, calculating the concordance index, and running cumulative incidence models.
Thoroughly documented
The beginning of the macro has extensive documentation on each parameter, including valid values and how to properly use each one.
Examples of Key Parameters
Below are the examples that were included in the sascommunity.org webpage.
Dataset for Examples
Make the dataset with the following code (also in macro documentation)
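/* Format that labels the three diagnosis groups */
proc format;
   value grpLabel 1='ALL' 2='AML low risk' 3='AML high risk';
run;
/* Four values per patient; @@ holds the line so two patients can share one row */
data BMT;
   input Diagnosis Ftime Status Gender @@;
   label Ftime="Days";
   format Diagnosis grpLabel.;
   datalines;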
1 2081 0 1 1 1602 0 1
1 1496 0 1 1 1462 0 0
1 1433 0 1 1 1377 0 1
1 1330 0 1 1 996 0 1
1 226 0 0 1 1199 0 1
1 1111 0 1 1 530 0 1
1 1182 0 0 1 1167 0 0
1 418 2 1 1 383 1 1
1 276 2 0 1 104 1 1
1 609 1 1 1 172 2 0
1 487 2 1 1 662 1 1
1 194 2 0 1 230 1 0
1 526 2 1 1 122 2 1
1 129 1 0 1 74 1 1
1 122 1 0 1 86 2 1
1 466 2 1 1 192 1 1
1 109 1 1 1 55 1 0
1 1 2 1 1 107 2 1
1 110 1 0 1 332 2 1
2 2569 0 1 2 2506 0 1
2 2409 0 1 2 2218 0 1
2 1857 0 0 2 1829 0 1
2 1562 0 1 2 1470 0 1
2 1363 0 1 2 1030 0 0
2 860 0 0 2 1258 0 0
2 2246 0 0 2 1870 0 0
2 1799 0 1 2 1709 0 0
2 1674 0 1 2 1568 0 1
2 1527 0 0 2 1324 0 1
2 957 0 1 2 932 0 0
2 847 0 1 2 848 0 1
2 1850 0 0 2 1843 0 0
2 1535 0 0 2 1447 0 0
2 1384 0 0 2 414 2 1
2 2204 2 0 2 1063 2 1
2 481 2 1 2 105 2 1
2 641 2 1 2 390 2 1
2 288 2 1 2 421 1 1
2 79 2 0 2 748 1 1
2 486 1 0 2 48 2 0
2 272 1 0 2 1074 2 1
2 381 1 0 2 10 2 1
2 53 2 0 2 80 2 0
2 35 2 0 2 248 1 1
2 704 2 0 2 211 1 1
2 219 1 1 2 606 1 1
3 2640 0 1 3 2430 0 1
3 2252 0 1 3 2140 0 1
3 2133 0 0 3 1238 0 1
3 1631 0 1 3 2024 0 0
3 1345 0 1 3 1136 0 1
3 845 0 0 3 422 1 0
3 162 2 1 3 84 1 0
3 100 1 1 3 2 2 1
3 47 1 1 3 242 1 1
3 456 1 1 3 268 1 0
3 318 2 0 3 32 1 1
3 467 1 0 3 47 1 1
3 390 1 1 3 183 2 0
3 105 2 1 3 115 1 0
3 164 2 0 3 93 1 0
3 120 1 0 3 80 2 1
3 677 2 1 3 64 1 0
3 168 2 0 3 74 2 0
3 16 2 0 3 157 1 0
3 625 1 0 3 48 1 0
3 273 1 1 3 63 2 1
3 76 1 1 3 113 1 0
3 363 2 1
;
run;
Four Variables:
- FTIME: Survival time
- STATUS: Survival status (0=Alive, 1=Death, 2=Other Failure)
- DIAGNOSIS: Type of disease (1='ALL' 2='AML low risk' 3='AML high risk')
- Gender: Patient's gender (0=Female, 1=Male)
Example 1: Basic Macro Call
%newsurv(DATA=bmt, TIME=FTIME, CENS=STATUS, CEN_VL=0, SUMMARY=0);
This is a basic macro call that uses almost exclusively required parameters. The graph is plotted with censor markers and confidence bounds by default, and displays the number of patients, number of events, and median time-to-event within the graph. The following parameters are introduced:
- DATA: The dataset to be used by the macro. This dataset will not be modified by the macro.
- TIME: The variable within DATA that contains the patients' time-to-event values. Must be numeric.
- CENS: The variable within DATA that contains the patients' event status. Must be numeric.
- CEN_VL: The value of the CENS variable that indicates a censored observation. Must be a numeric value.
- SUMMARY: This parameter determines if the table summary is displayed along with the plot. 1 is Yes and 0 is No.
Example 2: Plotting Multiple Curves with a CLASS Parameter
%newsurv(DATA=bmt, TIME=FTIME, CENS=STATUS, CEN_VL=0, SUMMARY=0, CLASS=DIAGNOSIS, CLASSREF=ALL,CLASSORDER=1 3 2);
This example shows how to add a CLASS parameter to produce grouped survival curves. Doing so also adds hazard ratios and a p-value to the plot statistical summary table. The example shows how to use other parameters related to the CLASS variable. The following new parameters are introduced:
- CLASS: Variable used to group the plots. Can be character or numeric.
- CLASSREF: Level of the CLASS variable that is to be used as the reference for the hazard ratios. The value specified must be one of the formatted values of the CLASS variable. If not specified then the last value alphabetically is used.
- CLASSORDER: Reorders the CLASS levels within the plot. The values are sorted alphabetically by formatted values by default. They are reordered by entering a numbered list. In this example 1 3 2 is used which specifies the first, then third, then second alphabetical levels. Ordering by 1 2 3 would match the default.
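The hazard ratios that appear once CLASS is specified come from a Cox model. As a hedged sketch of the kind of model involved (not the macro's actual code), the following call fits one directly with PROC PHREG, using 'ALL' as the reference level in the same way CLASSREF=ALL does. It assumes the BMT data set and the grpLabel format from the "Dataset for Examples" section have already been created.

```sas
/* Cox model with ALL as the reference level of Diagnosis.            */
/* REF= takes the formatted value, just as CLASSREF does.             */
proc phreg data=bmt;
   class Diagnosis (ref='ALL') / param=ref;
   /* 0 = censored; status 1 and 2 both count as events here          */
   model Ftime*Status(0) = Diagnosis / risklimits;
run;
```

The output contains one hazard ratio with 95% limits for each non-reference level, along with the Wald p-values and the type 3 tests for Diagnosis.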
Example 3: Modifying the Axes and Curves
%newsurv(DATA=bmt, TIME=FTIME, CENS=STATUS, CEN_VL=0, SUMMARY=0, CLASS=DIAGNOSIS, CLASSREF=ALL,CLASSORDER=1 3 2, COLOR=black red green, PATTERN=solid, LINESIZE=3pt, SYMBOLSIZE=10pt, XDIVISOR=30.44, XMAX=100, XINCREMENT=10, XLABEL=Months, YTYPE=PPT, YLABEL=Proportion Alive);
This example shows how to change the X and Y axis scale and labels as well as the attributes of the Kaplan-Meier curves. The x-axis is changed from days to months (this does not affect the original dataset) and the tick values are determined by XMAX and XINCREMENT. The y-axis is changed to proportion using the YTYPE parameter. The Kaplan-Meier curves can have their colors, patterns, thickness, and censor symbols modified. The following new parameters are all introduced:
- COLOR: Provides a list of colors that will be applied to the lines in the order of the CLASS variable. If only one color is provided then that color is used for all lines. Default is black.
- PATTERN: Provides a list of patterns that will be applied to the lines in the order of the CLASS variable. If only one pattern is provided then that pattern is used for all lines. Numbers between 1 and 42 can be used as well as certain keywords like SOLID. The default is AUTO, which makes all lines solid when colors are specified and different patterns when only one color is provided.
- LINESIZE: Provides the thickness of the graph lines. Must be a number followed by pt. Default is 1pt.
- SYMBOLSIZE: Provides the size of the censor symbols. Must be a number followed by pt. Default is 3pt.
- XDIVISOR: Converts the time variable to other units by dividing by a scalar value. Does not affect the original dataset.
- XMAX: Determines the maximum value of the x-axis based on final converted units. Computed automatically by default.
- XINCREMENT: Determines the tick value increment of the x-axis of final converted units. Computed automatically by default.
- XLABEL: Determines the label of the X-axis. Default is the time variable label.
- YTYPE: Determines if the Y-axis is in percentages (PCT) or proportions (PPT, as used in this example). Default is PCT (percent).
- YLABEL: Determines the label of the Y-axis. Default is either Percentage with Event or Proportion with Event.
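The value 30.44 used for XDIVISOR is the average month length: 365.25 days / 12 months = 30.4375, rounded to 30.44. The sketch below applies the same division in a DATA step purely to show the arithmetic; the macro does this conversion internally, so this step is not needed to use XDIVISOR. The data set name BMT_MONTHS and variable Months are made up for the illustration, and BMT is assumed to exist from the "Dataset for Examples" section.

```sas
/* Manual version of the days-to-months conversion                    */
data bmt_months;
   set bmt;
   Months = Ftime / 30.44;   /* 365.25 / 12 = 30.4375, rounded        */
   label Months = "Months";
run;

/* Compare the ranges of the original and converted time variables    */
proc means data=bmt_months min max maxdec=1;
   var Ftime Months;
run;
```

The longest follow-up in the example data is 2640 days, which is about 86.7 months, so XMAX=100 with XINCREMENT=10 covers the whole range.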
Example 4: Modifying the Statistics Table and Showing Event-free Rates
%newsurv(DATA=bmt, TIME=FTIME, CENS=STATUS, CEN_VL=0, SUMMARY=0, CLASS=DIAGNOSIS, CLASSREF=ALL,CLASSORDER=1 3 2, COLOR=black red green, PATTERN=solid, LINESIZE=3pt, SYMBOLSIZE=10pt, XDIVISOR=30.44, XMAX=100, XINCREMENT=10, XLABEL=Months, YTYPE=PPT, YLABEL=Proportion Alive, TIMELIST=20 40,TIMEDX=months,DISPLAY=legend timelist);
This example demonstrates changing which statistics are shown within the plot table and how to display Kaplan-Meier time-point event-free rates. One or more event-free rates can be specified, and they will be displayed vertically. The column that shows the time-point can be disabled when displaying only one time-point. The following new parameters are all introduced:
- DISPLAY: Controls which statistics are displayed within the plot summary table. Items are displayed in the same order that they are listed within the macro variable. The default changes depending on what kind of plot is being displayed.
- TIMELIST: Enter list of one or more time-points in the final X-axis units to calculate time-point event-free rates. Can be a simple list of numbers (xx xx xx xx) or a list in loop format (xx to xx by xx).
- TIMEDX: This parameter will be used to display the units within the time-point estimates. For example, entering Months will add the text Months after each time-point number within the plot.
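As a variation on this example, the loop format of TIMELIST can request yearly estimates in one go. This is a sketch that only uses parameters described above and assumes the macro has been compiled and the BMT data set created; the choice of 12 to 60 by 12 is arbitrary.

```sas
/* Event-free rates at 12, 24, 36, 48, and 60 months.                 */
/* TIMELIST is given in the final X-axis units (months, because       */
/* XDIVISOR=30.44 converts the days in FTIME to months).              */
%newsurv(DATA=bmt, TIME=FTIME, CENS=STATUS, CEN_VL=0, SUMMARY=0,
         CLASS=DIAGNOSIS, CLASSREF=ALL,
         XDIVISOR=30.44, XLABEL=Months,
         TIMELIST=12 to 60 by 12, TIMEDX=months,
         DISPLAY=legend timelist);
```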
Example 5: Adding Patients-at-Risk
%newsurv(DATA=bmt, TIME=FTIME, CENS=STATUS, CEN_VL=0, SUMMARY=0, CLASS=DIAGNOSIS, CLASSREF=ALL,CLASSORDER=1 3 2, COLOR=black red green, PATTERN=solid, LINESIZE=3pt, SYMBOLSIZE=10pt, XDIVISOR=30.44, XMAX=100, XINCREMENT=10, XLABEL=Months, YTYPE=PPT, YLABEL=Proportion Alive, RISKLIST=0 to 100 by 10,RISKLOCATION=BOTTOM,RISKCOLOR=1);
This example gives a basic demonstration of adding the patients-at-risk counts to the bottom of the plot. There are many options to customize the location of the labels, headers, where the table is printed, and even what type of numbers are shown. The following new parameters are all introduced:
- RISKLIST: Enter list of one or more time-points in the final X-axis units to display the current patients-at-risk. Can be a simple list of numbers (xx xx xx) or a list in loop format (xx to xx by xx). Does not have to be the same as TIMELIST and does not have to match X-axis tick marks.
- RISKLOCATION: Determines where the patients-at-risk will be drawn. Default is BOTTOM, which displays the numbers below the X-axis. Specifying INSIDE would put the numbers under the curve above the X-axis.
- RISKCOLOR: Determines if the patients-at-risk numbers are colored to match the Kaplan-Meier curves. Default is 0 (no). Specifying 1 will color the numbers, and can potentially make matching the numbers to the curves visually easier.
Other useful options:
- PARHEADER: Determines the header used above the patients-at-risk table. If left blank then will not be drawn.
- PARHEADERALIGN: Determines where the header will be drawn. Options are LEFT, CENTER, RIGHT, and LABELS. Specifying LABELS will place the header above the labels to the left of the numbers.
- RISKLABELLOCATION: Determines where the labels for the numbers are drawn. Options are LEFT, ABOVE and null. Specifying LEFT draws the labels to the left. Specifying ABOVE draws the labels above the numbers in their own row which is useful for long labels. Specifying nothing will cause the labels to not be drawn. This can be useful when pairing with RISKCOLOR.
- PARDISPLAY (New): Determines what numbers are shown in the patients-at-risk table. One or more items can be listed. Options are PAR (Patients-at-risk), NCENS (Number of cumulative censors), and NEVENTS (Number of cumulative events). Two combinations, PAR_NCENS and PAR_NEVENTS are also allowed.
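A possible way to combine several of these options in one call is sketched below. It assumes the macro has been compiled, the BMT data set exists, and the macro version in use supports all of the listed parameters (PARDISPLAY is marked as new above); the header text is just an example value.

```sas
/* Patients-at-risk every 20 months, colored to match the curves,     */
/* with labels in their own row and cumulative events shown alongside */
%newsurv(DATA=bmt, TIME=FTIME, CENS=STATUS, CEN_VL=0, SUMMARY=0,
         CLASS=DIAGNOSIS, CLASSREF=ALL,
         XDIVISOR=30.44, XMAX=100, XINCREMENT=10, XLABEL=Months,
         RISKLIST=0 to 100 by 20, RISKLOCATION=BOTTOM, RISKCOLOR=1,
         RISKLABELLOCATION=ABOVE,
         PARHEADER=Patients-at-Risk,
         PARDISPLAY=PAR_NEVENTS);
```

Placing the labels above the numbers leaves more horizontal room for long class labels such as "AML high risk".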
Example 6: Cumulative Incidence
%newsurv(DATA=bmt, TIME=FTIME, CENS=STATUS, CEN_VL=0, SUMMARY=0, CLASS=DIAGNOSIS, CLASSREF=ALL,CLASSORDER=1 3 2, COLOR=black red green, PATTERN=solid, LINESIZE=3pt, SYMBOLSIZE=10pt, METHOD=CIF, EV_VL=1);
This example shows how to plot cumulative incidence instead of Kaplan-Meier curves. The following new parameters are all introduced:
- METHOD: Determines which method is used to generate the curves. Options are KM (Kaplan-Meier) or CIF (Cumulative Incidence Function).
- EV_VL: Determines which value of the status variable is the event of interest. Non-event and non-censor values are considered other events.
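For comparison, in SAS 9.4M3 and later the same kind of analysis can be requested from PROC LIFETEST directly through the EVENTCODE= option. This is a sketch of the procedure call only, not the macro's code, and it assumes the BMT data set from the "Dataset for Examples" section.

```sas
/* Cumulative incidence of death (Status=1).                          */
/* Status=0 is censored; Status=2 is treated as a competing event.    */
proc lifetest data=bmt plots=cif;
   time Ftime*Status(0) / eventcode=1;
   strata Diagnosis;          /* requests Gray's test across groups   */
run;
```

EVENTCODE=1 plays the same role here that EV_VL=1 plays in the macro call.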
Example 7: Multiple Plots
%newsurv(DATA=bmt, TIME=FTIME, CENS=STATUS, CEN_VL=0, SUMMARY=0, CLASS=DIAGNOSIS, CLASSREF=ALL,CLASSORDER=1 3 2, COLOR=black red green, PATTERN=solid, LINESIZE=3pt, SYMBOLSIZE=10pt, METHOD=KM|CIF, EV_VL=1, SREVERSE=1|0, NMODELS=2, ROWS=2, AUTOALIGN=BOTTOMRIGHT|TOPRIGHT);
This example shows how to produce multiple plots in a lattice diagram. Any option that differs between plots uses the | (pipe) delimiter to give one setting per plot. Any option without a | delimiter keeps the same setting across all models. This example also demonstrates the difference between plotting CIF and plotting 1-Survival. The following new parameters are all introduced:
- SREVERSE: Determines if Survival or 1-Survival is plotted. 1-Survival is not the same as CIF. Default is 0 (Survival) and 1 indicates 1-Survival.
- NMODELS: Determines how many models will be run. Default is 1.
- ROWS: Determines how many rows will be in the graph lattice. Default is 1.
- AUTOALIGN: Determines where the statistical summary table is shown within the plot. Default is TOPRIGHT. Can be anchored to any of the 9 primary points of the plot (TOPRIGHT, TOP, TOPLEFT, LEFT, CENTER, RIGHT, BOTTOMRIGHT, BOTTOM, BOTTOMLEFT).
Example 8: Reference Lines
%newsurv(DATA=bmt, TIME=FTIME, CENS=STATUS, CEN_VL=0, SUMMARY=0, CLASS=DIAGNOSIS, CLASSREF=ALL,CLASSORDER=1 3 2, COLOR=black red green, PATTERN=solid, LINESIZE=3pt, SYMBOLSIZE=10pt, REFLINES=medians,REFLINEAXIS=both);
This example shows how to add reference lines. Reference lines can highlight two different items: medians and time-point estimates. The reference lines can be dropped to either axis. The following new parameters are all introduced:
- REFLINES: Determines which statistic is used for the reference lines. Either MEDIANS or TIMEPOINTS are allowed. If TIMEPOINTS is specified then all time-points in TIMELIST are shown.
- REFLINEAXIS: Determines which axis the reference lines are drawn to. Options are X, Y or Both.
Other options:
- REFLINEMETHOD: Determines if reference lines are drawn from the KM curves or across the whole plot. Default is DROP. Options are DROP and FULL.
Example 9: Confidence Intervals
%newsurv(DATA=bmt, TIME=FTIME, CENS=STATUS, CEN_VL=0, SUMMARY=0, CLASS=DIAGNOSIS, CLASSREF=ALL,CLASSORDER=1 3 2, COLOR=black red green, PATTERN=solid, LINESIZE=3pt, SYMBOLSIZE=10pt, PLOTCI=1);
This example shows how to add confidence intervals. Confidence intervals are automatically added for graphs with no CLASS variable. Confidence intervals can be added as a filled background, as lines, or both. By default only a filled background is used similar to the LIFETEST procedure. The following new parameters are all introduced:
- PLOTCI: Determines if confidence intervals are drawn. Default is 2, which draws confidence intervals when no CLASS variable is provided and omits them when a CLASS variable is provided. Specify 1 to force confidence intervals or 0 to disable them.
Expansion Macros
Adjusted Survival
Multivariate models are often necessary in survival analysis in order to account for confounding factors. Adjusting for other factors can dramatically change outcomes such as hazard ratio. When outcomes are dramatically changed in a multivariate model it can be inappropriate to plot the unadjusted curves. There are numerous methods available for creating adjusted survival curves, and none are the correct method for all situations. Thus there was a need to have a series of macros that could create the high quality plots of NEWSURV, but with the appropriate methodologies for adjusted survival curves. These macros were designed to adjust survival curves using either the direct adjustment or inverse weights methodologies. A third macro, NEWSURV_DATA, allows the user to pre-calculate their own survival curves and then plot them with the customization of NEWSURV.
NEWSURV_ADJ_DIRECT
The NEWSURV_ADJ_DIRECT macro calculates the adjusted survival curves based off of the predicted survival curves created by the PHREG procedure. This is described in more detail in the PharmaSUG 2017 paper.
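To make the idea of direct adjustment more concrete, here is a sketch using the DIRADJ option of the BASELINE statement in PROC PHREG, which averages the predicted survival curves over the covariate patterns in a data set. It shows one way SAS exposes this method and is not necessarily how the macro computes its curves. The output names ADJ_SURV and AdjSurvival are made up for the example, and the BMT data set is assumed to exist.

```sas
/* Direct adjusted survival curves for Diagnosis, adjusting for Gender */
proc phreg data=bmt;
   class Diagnosis (ref='ALL') Gender / param=ref;
   model Ftime*Status(0) = Diagnosis Gender;
   /* Average the predicted curves over every patient in BMT,         */
   /* producing one adjusted curve per level of Diagnosis             */
   baseline covariates=bmt out=adj_surv survival=AdjSurvival
            / diradj group=Diagnosis;
run;
```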
NEWSURV_ADJ_INVWTS
The NEWSURV_ADJ_INVWTS macro calculates weights from the LOGISTIC procedure that it then supplies to the WEIGHT statement of the LIFETEST procedure. This is described in more detail in the PharmaSUG 2017 paper.
NEWSURV_DATA
The NEWSURV_DATA macro allows the user to specify their own dataset with time and survival variables that have been previously calculated in order to produce a highly customizable journal quality image. This allows the user to use their own adjustment or calculation method not available in the NEWSURV macros.
Example 10: Adjusted Survival Curves
%newsurv_adj_invwts(DATA=bmt, TIME=FTIME, CENS=STATUS, CEN_VL=0, SUMMARY=0, CLASS=DIAGNOSIS, CLASSREF=ALL,CLASSORDER=1 3 2, COLOR=black red green,LINESIZE=3pt, SYMBOLSIZE=10pt, CLASSCOV=gender);
This example shows how to make adjusted survival curves using the inverse weights method. The macro parameters are largely the same as those of NEWSURV. The following new parameters are all introduced:
- CLASSCOV: Specifies discrete covariates to adjust the survival curves, hazard ratios, and p-values by. This is also available in NEWSURV, but will not adjust the actual curves.
Other options:
- CONTCOV: Specifies continuous covariates to adjust the survival curves, hazard ratios and p-values by. This is also available in NEWSURV, but will not adjust the actual curves.
- PLOT_UNADJUST: Determines if the unadjusted survival curves are drawn. Default is 1 (Yes).
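In the 04/2020 version of the macro (newsurv_web_042020.sas, mentioned in the comments below), the adjusted methods are part of NEWSURV itself and are selected with METHOD=DIRECT or METHOD=INVWTS. A sketch of the equivalent of Example 10, assuming that version is compiled and that CLASSCOV works there as described above:

```sas
/* Inverse-weights adjusted curves from the combined NEWSURV macro.   */
/* Assumes the 04/2020 version, where METHOD accepts INVWTS.          */
%newsurv(DATA=bmt, TIME=FTIME, CENS=STATUS, CEN_VL=0, SUMMARY=0,
         CLASS=DIAGNOSIS, CLASSREF=ALL, CLASSORDER=1 3 2,
         COLOR=black red green, LINESIZE=3pt, SYMBOLSIZE=10pt,
         METHOD=INVWTS, CLASSCOV=gender);
```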
Macros and options provided by SAS
Editor's note: we added this mention of SAS-provided macros/options for completeness.
In addition to the thorough options provided here and documented in @JeffMeyers' papers, SAS also provides guidance and code for customizing the Kaplan-Meier survival plot.
- See the special chapter in the SAS/STAT User's Guide.
- Download the source code for SAS macros
Note that these options/macros work only with the more recent versions of SAS.
This is excellent @JeffMeyers! We know that this article is among the most popular on sasCommunity.org, and we actually had drafted a version of this for you here (as we've done for some others). But your article is so deep and thorough, we had not yet completed the quality-check we felt necessary to publish it. Thank you for managing that for us!
Thank you @ChrisHemedinger. I saw that on the list of unpublished articles so I was hoping there wouldn't be any issues posting this. Let me know if you have any suggestions on the formatting or anything for this article as I'm still new at making these. I'm planning to work on my other big article and potentially new programs to share as well.
This looks really useful, but I cannot download any of the files.
It just says "virus scan in progress...." beneath each file, but nothing more is happening.
Am I doing something wrong?
Hi @ghbg,
Sorry for the inconvenience. There is a bug preventing attachments from working properly. We hope to have this resolved soon.
Best,
Shelley
This is incredibly useful - thank you Jeff for making it available.
I am having difficulty with one small thing: I cannot get the image file to save. I have followed the instructions in the macro, and specified the following:
GPATH=F:\Folder\Folder\Folder\My folder\
PLOTTYPE=png
PLOTNAME=test
But nothing appears in the folder that I specify. Is there another part that I am missing??
works like a dream - thanks very much
Hi, this is a really useful macro!
I was wondering if it is possible to display the number of events alongside the number at risk, rather than the number of cumulative events?
Thanks!
Hi, thank you very much for this helpful macro!
We do have a question concerning the use of the macro.
With every run of the macro (used to plot a Kaplan-Meier curve) the output window is reset.
Is it possible to prevent the macro from resetting the output window?
We do have to run Kaplan-Meier (KM) for different outcome variables and it would be very helpful to have all KM curves in one output window.
Thanks!
sasstats
Hello Jeff,
thanks for your answer.
We are using SAS 9.4 on PCs.
Our preferred output destinations are HTML, excel or rtf.
Changing the plot name wouldn't really solve our problem.
Is there any option that enables the macro to append new plots instead of overwriting?
Our aim is to get several different Kaplan-Meier curves (for different groups) in one Excel sheet.
Normally, without using the Macro %newsurv SAS appends every new plot or table to an existing output-window or rtf-File.
But the macro %newsurv overwrites an existing file as you said above.
sasstats
Hello @sasstats,
Could you link the code you're trying to use? Normally when you wrap ODS tags around the macro it'll just append it to the report you have.
Example:
ODS RTF file='test.rtf';
%newsurv(...)
%newsurv(...)
ods rtf close;
However if you have OUTDOC specified in the macro call it will only write to that single file. So if you're trying to append the graph to a report within ODS tags make sure that OUTDOC is blank. Otherwise if you link the code you're trying to use (can use ... for your other report items) I can test it out on our Windows server to see if there's a glitch there as we normally use Linux.
Hello Jeff,
it works fine!
Thank you very much!!!
This is a great help for us!
sasstats
Hello Jeff @JeffMeyers ,
I am getting the error "ERROR: Subquery evaluated to more than one row." when using the %newsurv_adj_direct macro. Even when I use the example data. Do you know why this is happening?
I am using SAS Online studio.
Thanks
Ubai
Can the y-axis maximum be specified for a cumulative incidence curve? I want the maximum to be 30% (0.30).
Hi @JeffMeyers ,
Thanks for this wonderful macro.
I am getting an error when using PARHEADERALIGN:
ERROR: The keyword parameter PARHEADERALIGN was not defined with the macro.
Also, would it be possible to show the strata variable label in the graph?
Thank you
Hello everyone. I've posted a new version of the macro (newsurv_web_042020.sas) that has my most recent updates and also combines the adjusted macros into the original NEWSURV. Now you can specify method=DIRECT or method=INVWTS to do the adjusted methods. Just a note that I don't use Windows SAS so if there are any bugs found while using with Windows please e-mail me at meyers.jeffrey@mayo.edu to get the fastest response.
@Ubai I found that error was because Windows SAS handled missing values differently than Linux SAS. Please try my new version to see if you get the error still (see note in previous paragraph).
@greg_maislin You can set YMAX to be either 30 or 0.3 depending on which option you have set for YTYPE. Example:
%newsurv(data=sashelp.bmt,time=t,cens=status,method=cif,class=group,ev_vl=1,ymax=30)
@Alnajar The option you are looking for is PARALIGN, not PARHEADERALIGN I believe.
@JeffMeyers Hi Jeff and thank you for the update. I really appreciate the effort you put into this macro. I use it all the time for my research.
I've just tried the macro in SAS online studio. I tried adjusting survival curves and the macro works flawlessly. I was wondering, is it possible to adjust CIF curves as well?
Removing the censor marker from the legend was a long-awaited change.
Also, what method does the macro use when calculating the adjusted HR for an outcome with a competing risk(s)? Can I manipulate the method?
Hello @Ubai. The latest version I posted can remove the censor markers from the legend by setting CENSORMARKERS=2. I honestly do not know if it is possible to adjust CIF curves. I personally am not a statistician but am a statistical programmer. I had Terry Therneau teach me about adjusted survival here but have not had any interactions with adjusted CIF.
The adjusted hazard ratios with competing risks are calculated within the PHREG procedure using the EVENTCODE= option in the MODEL statement.
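As a rough illustration of that option (not the macro's internal code), a standalone PHREG call with EVENTCODE= might look like the sketch below. SASHELP.BMT has no true competing event, so it only shows the syntax; with real data, the status variable would carry a separate code for each competing event:

```sas
/* Sketch only: EVENTCODE=1 names the event of interest, 0 is the censoring value. */
/* Any other nonzero status values would be treated as competing events.           */
proc phreg data=sashelp.bmt;
   class group / param=ref;
   model t*status(0) = group / eventcode=1 risklimits;
run;
```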
@JeffMeyers exactly, I have noticed the addition of censormarkers = 2 and I was complimenting it. 🙂🙂
As far as I know, there are methods to adjust CIF curves. I found some literature on this topic. The first two articles are about SAS macros that run this task. I don't think it is technically very different from adjusting KM curves.
However, if the CIF is adjustable, it might be more practical to split the macro variable METHOD into a variable METHOD with values (KM, CIF), as it was before, and a new variable, for example ADJ_METHOD (NONE, DIRECT, INVWTS), with default = NONE.
References:
Direct adjusted survival and cumulative incidence curves for observational studies
Keep up the good work 😉
Ubai
This is a fantastically useful macro, thank you. I am having one small problem though:
I want to increase the size of the font in my graphic, for the axes, labels and patients-at-risk box. I have found the elements that do this in the instructions (XTICKVALSIZE, LABELSIZE, PARSIZE, etc). The problem is that when I increase the font for the patients-at-risk box above about 10pt, the labels sort of overlap each other and the numbers on each row squash together. Hopefully the picture below shows what I mean. I can't figure out how to space them out. Aside from this, it looks great.
Can anyone help?
Hello @ghbg. Use the parameter RISKROWWEIGHTS to allocate more space to the patients at risk numbers. The number represents how much graph space each row gets as a proportion. Try setting it to 0.045 and increase or decrease from there.
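A minimal sketch of such a call, using only parameters that appear in this thread. The PARSIZE value format (12pt) and the RISKLIST range are assumptions chosen for SASHELP.BMT:

```sas
/* Larger patients-at-risk font with more vertical space per row.               */
/* RISKROWWEIGHTS=0.045 is the suggested starting point; tune it up or down.     */
%newsurv(data=sashelp.bmt, time=t, cens=status, cen_vl=0, class=group,
         risklist=0 to 2500 by 500, risklocation=bottom,
         parsize=12pt, riskrowweights=0.045)
```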
thanks. works perfectly
Hi @JeffMeyers . Sorry to post again but I have another query...
I am using a large dataset with ~26,000 records in one exposure group. I need to make a black and white graph, so I am using the following parameters in the NEWSURV macro:
COLOR=black,
PATTERN= SOLID SHORTDASH,
When I run this, I get an error message saying:
The STEPPLOT is not drawn because the limit for maximum number of observations per patterned line has been exceeded. You can set line pattern to SOLID or set LINEPATTERNOBSMAX=67200 in the ODS GRAPHICS statement to enable the plot. Increasing this limit could make Java VM run out of memory. See product documentation for related considerations.
I don't get this error when I use coloured lines (e.g. black blue). I have tried including LINEPATTERNOBSMAX=67200 in the code, as suggested, but it doesn't recognise this as a parameter.
Could you let me know if there is a way around this?
Hello @ghbg,
That error is not something I generate in the macro; it comes from SAS itself. I think it is saying that there are too many observations in the dashed line being drawn. There isn't a parameter in my macro for this at the moment, but if you search through the macro code for the SGRENDER call and find the ODS GRAPHICS statement that comes just before it, you can add LINEPATTERNOBSMAX to that statement to adjust the limit on your end.
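A sketch of what the edited statement could look like. The value 67200 is the one suggested in the error message; any options already present on the macro's ODS GRAPHICS statement should be kept alongside it:

```sas
/* Inside the macro source, on the ODS GRAPHICS statement that precedes PROC SGRENDER. */
/* Raises the per-line observation limit for patterned (dashed) lines.                 */
ods graphics / linepatternobsmax=67200;
```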
works a treat. thanks again for the speedy response @JeffMeyers
Hi @JeffMeyers - I'm trying to output the plots in a vector scalable format. I am currently using SAS Enterprise Guide 7.15.
I added:
ODS LISTING;
GPATH = "mypathway..."
PLOTNAME = test1,
SVG = 1,
DESTINATION = PDF
ODS LISTING close;
Yet I don't get anything in my file path. Also, how would I know if the file was successfully saved as a scalable vector file? I tried doing an ODS file and it outputs a PDF file, but it is a normal .pdf file (not sure if it's scalable vector). Thanks for any help.
Hello @bransivs
To output a scalable vector graphic in PDF format you need to use OUTDOC=filepath/filename.pdf. A PDF is a document and not an image, but the graphics it contains will be scalable vector graphics. You will know it worked if you can select the text and zoom in without fuzziness. You can also skip this if you want to wrap the macro in your own ODS PDF statements.
Another option is to save as EMF (setting PLOTTYPE=EMF, SVG=1, GPATH=wherever). I haven't tested this code in Enterprise Guide, so I'm not sure how well it works there. You could then insert the file into an Office product and right click -> edit to confirm that it is scalable vector. One caveat is that this doesn't work with dashed lines.
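A minimal sketch of the OUTDOC approach, combining it with the DESTINATION=PDF and SVG=1 settings that appear elsewhere in this thread. The output path is hypothetical and should be replaced with a real folder:

```sas
/* Hypothetical path; the macro writes the PDF document named in OUTDOC */
%newsurv(data=sashelp.bmt, time=t, cens=status, cen_vl=0, class=group,
         destination=pdf, svg=1,
         outdoc=/path/to/output/km_plot.pdf)
```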
Thanks Jeff - following up on the vector formatting above.
Is there any other workaround? The approach mentioned above does not produce a vector image, regardless of which ODS PDF statement I wrap around the macro.
Hi @JeffMeyers
I am having difficulty getting the PAR subtitle to align in the same way as my risk labels in the PAR table...
PARHEADER=Number at risk,
PARALIGN=labels,
RISKLOCATION=bottom,
RISKLABELLOCATION=left
This gives the PAR table outside the plot, with risk labels to the left and PAR subtitle above the risk labels, but the subtitle is left-aligned whereas the risk labels are right-aligned. See image below.
The issue only occurs when I have longer risk labels, as here. Shorter labels seem to automatically left-align, along with the subtitle:
Any idea how I can get "number at risk" right-aligned like the longer risk labels ("donor with/without endocarditis")?
Hello @ghbg. The labels are always right aligned in that space. The reason they look left aligned in your shorter label example is that they're the same length with similar characters. They are still right aligned within their space, with the PAR header starting from the left end of that space. The PAR header is created differently: it's drawn with annotation so that it can cross different graph spaces without getting cut off, so I can't set it to be right aligned like those labels are. It may be tedious, but the only real way around it is to add some spaces to the front of the PAR header to bump it more to the right. For example: PARHEADER=%str( Number at risk). This should work, but if not then you would need to use non-breaking space characters:
data _null_;
call symputx('nbs','A0'x);
run;
PARHEADER=&nbs.&nbs.&nbs.&nbs.Patients at risk
thanks very much. i thought it might need something like that.
can i also ask how i might left-align the risk labels instead? would it just be adding spaces after the shorter label?
@ghbg If you go into the version of the macro you have downloaded, you can search the graph template code for the row headers section that makes these. They are created with ENTRY statements within a ROWHEADERS block, and you would just need to change the HALIGN option from RIGHT to LEFT. I don't have an option for it currently because the existing options try to remove as much white space from the left side of the plot as possible.
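To show the kind of statement being described, here is a small standalone GTL sketch. It is not the macro's template: the template name, labels, and the SASHELP.CLASS scatter plots are placeholders used only to illustrate ENTRY statements with HALIGN=LEFT inside a ROWHEADERS block:

```sas
/* Hypothetical demo template; the macro's own template is far more involved */
proc template;
   define statgraph rowheader_demo;
      begingraph;
         layout lattice / rows=2 columns=1;
            rowheaders;
               /* HALIGN=LEFT is the change described above (the macro uses RIGHT) */
               entry halign=left "Short label";
               entry halign=left "A much longer label";
            endrowheaders;
            layout overlay;
               scatterplot x=height y=weight;
            endlayout;
            layout overlay;
               scatterplot x=height y=weight;
            endlayout;
         endlayout;
      endgraph;
   end;
run;

proc sgrender data=sashelp.class template=rowheader_demo;
run;
```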
Hi @JeffMeyers
I am having a problem with the title of my graph. The text is not wrapping, just continuing on one line and getting cut off at the edge of the plot. See below.
Can you suggest a solution?
thanks
George
Hello @ghbg, you can use the ` symbol (the unshifted ~ key) to specify a line break in the TITLE/OVTITLE options.
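A short sketch of a two-line title using that delimiter. The title text is made up for illustration:

```sas
/* The ` character marks where the title should wrap onto a second line */
%newsurv(data=sashelp.bmt, time=t, cens=status, cen_vl=0, class=group,
         title=Disease-free survival after bone marrow transplant`by risk group)
```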
@JeffMeyers This macro is very useful to me, thanks. I am having an issue with a long Y-axis label: it is longer than the height of the PNG set in the macro call. How can I adjust it, or is it possible to split it into two lines?
Dear @JeffMeyers ,
I'm plotting multiple KM curves in one figure. I would like to leave the title for some plots blank. However, when I do this the plot expands into the whole area, as originally planned for a plot without a title. I tried using the ` symbol, but it did not work.
Here is an example of what I am looking for. In this case I used a title and then deleted it manually using a picture editing software.
I also deleted the title for the Y axis. Is there a way to leave it empty? If I do not specify a title, the default title for the axis is used.
Any ideas?
There are two solutions:
Solution 1: You can adjust the font size of the Y axis label using LSIZE = 12 pt. Try different sizes for your title; 8 or 9 should work.
Solution 2: You can use the ` symbol in your title to split it into two lines.
@Ubai, thanks. Your second solution worked.
Hello @Ubai
There are two things you can try. The first is to set the title to %str( ) (with a space inside), which makes a macro-quoted space character. The second is to make a non-breaking space macro variable and use that.
Data _null_;
call symput('nbs','A0'x);
run;
then use &nbs in the title or label.
Hello @JeffMeyers. Thank you for this amazing macro. I have just started using it and have some questions about how to control the headers and the other items marked in the image. Any suggestions for adjusting those? I was able to change the hazard ratio and p-value headers but not the 'GRPVAR' one. Which parameter controls that heading? Maybe I am missing some parameters in the macro call.
%newsurv(DATA=f1,
TIME=aval,
CENS=cnsr,
CEN_VL=1,
SUMMARY=0,
height=4in,
WIDTH= 9in, /*color=red,*/
YLABEL= Probability of Reaching ,
CLASS=GRPVAR,
CLASSREF=,
CLASSORDER=2 1,
CLASSVALALIGN= LEFT,
COLOR= blue red,
PATTERN= 1 29,
LINESIZE=2pt,
SYMBOLSIZE=10pt,
CENSORMARKERS =1,
XDIVISOR=1,
XMAX=1800,
XINCREMENT=100,
XLABEL=Days,
XTICKVALSIZE=8,
YTYPE=PPT,
TIMELIST=463,
TIMEDX=months,
METHOD=KM,
PLOTTYPE=jpeg,
BORDER=0,
SHOWWALLS=1,
DISPLAY=legend HR COVPVAL , COVPVALHEADER= P-Value, HRHEADER= Hazard Ratio (95% CI), HRDIGITS=3,
PLOTPVAL= LOGRANK ,
STATCOLOR=1,
AUTOALIGN= TOP CENTER,
debug=1,
RISKLIST=0 to 1800 by 100,
RISKLOCATION=BOTTOM,
RISKCOLOR=1,
RISKROWWEIGHTS=.040,
PARHEADER= No of Subjects at Risk,
/* PARALIGN=LABELS,*/
RISKLABELLOCATION= left,
PARALIGN=LABELS,
/* PARDISPLAY=PAR_NEVENTS,*/
SREVERSE= 1,
/* GRIDLINES=1*/
REFLINES= TIMEPOINTS,
REFLINEAXIS= X,
REFLINEMETHOD= FULL );
ods rtf close;
ods listing close;
Hi Jeff, I was wondering how to change the logrank pvalue display from <.0001 to <0.0001? If that's possible. I tried playing around with pvaldigits parameter but it seemed to make no difference. Thanks very much.
Hi @JeffMeyers
Sorry for another query. This macro is extremely useful and has helped enormously with my publications. Thank you.
I am making a lattice with 4 panels, each with a different KM curve. I need to save it as a png and as a pdf. The pdf looks perfect but the png image quality is very low, despite increasing the DPI. Changing the dpi doesn't seem to affect the png image quality (or the size of the file) at all - it just makes the pdf look quite speckled. I am using the following commands to make the PDF and PNG:
DPI=800,
PLOTNAME=my_graph,
GPATH=C:\my folder,
DESTINATION=pdf,
SVG=1,
OUTDOC=C:\my folder\my_graph.pdf
The pdf is great but I also need to create an image that I can include in a word document, and the PNG is no good. See attached close up
can you help at all?
Hello @ghbg ,
I tested this in my programming environment. I may need to do some code updating, but I found that I needed to turn on ODS LISTING prior to running the macro for the DPI to affect the final PNG image. This also only worked if I removed the outdoc, svg and destination options. I will look into the reasons for this and see if I can make some changes.
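A sketch of that workaround: open the LISTING destination first, then call the macro without the OUTDOC, SVG, and DESTINATION options. PLOTTYPE=PNG is an assumption (the thread only shows JPEG and EMF as values), and the folder path is hypothetical:

```sas
/* Open the LISTING destination so that the DPI setting reaches the image file */
ods listing;

/* Assumption: PLOTTYPE=PNG is accepted; OUTDOC, SVG and DESTINATION are left out */
%newsurv(data=sashelp.bmt, time=t, cens=status, cen_vl=0, class=group,
         plottype=png, dpi=800, plotname=my_graph, gpath=/path/to/folder)
```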
Hello @KDS1 ,
The macro makes use of the PVALUE format which doesn't seem to like having the leading zero. I chose to use this format since it was readily available and so that the macro wasn't making a format it would later need to erase. If you have an internal picture format or regular format you would rather use you can search through the macro for the PVALDIGITS keyword in the macro text to replace the format next to it. It should look something like: strip(put(b.pval,pvalue%sysevalf(6.&&pvaldigits&z))) where the macro is applying the number of p-value digits to the PVALUE format.
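One possible replacement is a small user-defined format that labels very small values with a leading zero and passes everything else to a standard numeric format. This is only a sketch: the format name PVLZ is hypothetical, the cutoff and width assume four p-value digits, and the macro itself would still need to be edited to call it in place of PVALUE:

```sas
/* Hypothetical format PVLZ; not part of the macro */
proc format;
   value pvlz
      low -< 0.0001 = '<0.0001'   /* small values get a label with a leading zero */
      other         = [6.4];      /* everything else uses the standard 6.4 format */
run;

/* Write both versions to the log for a quick side-by-side check */
data _null_;
   do p = 0.00001, 0.0321;
      put p= pvalue6.4 +2 p= pvlz7.;
   end;
run;
```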
I am having a problem with the axis colours. They are coming out in black on the png file, but grey on the pdf, even though i have specified axiscolor=black
any suggestions?
Hi @JeffMeyers ,
I am looking for a macro that can handle pre-calculated weights in order to produce a weighted HR and weighted KM (i.e. weighted median and N at risk). The weights are stored in the original dataset as variable sw. In addition, I need to add option COVSANDWICH when estimating the 95%CI. I was wondering, if you have a version of the %NEWSURV macro that can do this.
PROC PHREG DATA=data COVSANDWICH;
weight sw;
CLASS arm (Ref=first);
MODEL OS*CNSR(1)=arm /rl=wald ties=efron;
run;
proc lifetest data=data;
time OS*CNSR(1);
strata arm;
weight sw;
run;
Hi @JeffMeyers
Thanks for all the help on this. Your suggestion for enabling the DPI by adding "ods listing" has worked.
I have another small problem. I don't know if it is in the macro or in what I am typing:
I cannot get a line break in the PVALHEADER text. If I use the ` delimiters, these just appear in the text, so including "PVALHEADER=` ` ` log rank tesp=" shows exactly this (without any line breaks).
Any advice?
Hello @ghbg
I don't think the pvalheader is really set to have line breaks since it's usually in one row. If you would like to add some line breaks I would suggest adding something to TABLECOMMENTS and then listing tablecomments prior to pval in the DISPLAY option.
Example:
tablecomments=%str( )`%str( )`%str( ), display=...other stuff...tablecomments pval,