Demographic Table and Subgroup Summary Macro %TABLEN

15 Likes

This was a paper that was supposed to be presented at PharmaSUG 2020 this year, but due to the cancellation of the conference I still wanted to create a page with the paper and macro for download to share with everyone. This is a macro that my coworkers and I use commonly every day to assess the demographic distributions in our oncology trials.

Abstract

Clinical trial publications frequently allocate at least one of the allotted tables to summarize the demographics, stratification factors, and other variables of interest of the patients involved with the study. These tables generally include basic distribution information such as frequencies, means, medians, and ranges while also comparing these distributions across a key factor, such as treatment arm, to determine whether there was any imbalance in patient populations when doing analysis. These distributions are not difficult to compute and combine into a table, but as treatments become more specific to patient characteristics such as genetic biomarkers and tumor stages there is a need to be able to display the distributions in subgroups. The macro %TABLEN is a tool developed to compute distribution statistics for continuous variables, date variables, discrete variables, univariate survival time-to-event variables, and univariate logistic regression variables and combine them into a table for publication. %TABLEN has multiple features for comparisons, subgrouping, and outputting to multiple destinations such as RTF and PDF. The macro %TABLEN is a valuable tool for any programmer summarizing patient level data.

Example of demographic summary table

Clinical manuscripts often lead with a summary table of the demographics of patients included in the study to demonstrate a balance between arms and other subgroups. The format is rather standard, and the TABLEN macro focuses on creating tables in this format:

%tablen(data=example, by=arm, 
     var=age date_on sex race1 smoke_st num_met,
     type=1 3 2, outdoc=~/ibm/example1.rtf);

These tables include demographic background variables along with key study factors and show the distributions in order to assess whether the variables are balanced across study arms.

What makes this macro special?

Given how often these tables are used in clinical manuscripts it is safe to say that multiple data centers already have in house programs (Mayo Clinic sure did) to make these tables, so what differentiates %TABLEN from what might already exist?

Five different variable types

The TABLEN macro has the capability to summarize five different types of variables: continuous, discrete, date, survival and logistic. Each variable type has individual options to customize the output. These are specified with the VAR and TYPE macro options. The VAR option holds the list of variables and the TYPE option specifies which type they are with the following options:

1 = Continuous
2 = Discrete
3 = Date
4 = Survival
5 = Logistic

Numeric and Date type variables

These can include variables such as age, laboratory values, and date of randomization. Both numeric and date type variables are included here because they are summarized the same way, but different formats are applied to the results. The statistics that are shown are controlled by either the CONTDISPLAY or DATEDISPLAY macro options. Any number of the following keywords can be listed in either option, and are displayed in the order they are listed in the output:

N: Number of non-missing records
NMISS: Number of missing records
N_NMISS: Combines N and NMISS as N (NMISS)
MEAN: Mean of the distribution
SD: Standard Deviation of the distribution
MEAN_SD: Combines Mean and SD as MEAN (SD)
MEDIAN: Median of the distribution
RANGE: Range of the distribution as MIN, MAX
IQR: Interquartile range of the distribution as Q1, Q3
MEDIAN_RANGE: Combines Median and Range as MEDIAN (MIN, MAX)
MEDIAN_IQR: Combines Median and IQR as MEDIAN (Q1, Q3)

The macro option DATEFMT controls the format that is applied for date variables, and there are options to control how many significant digits are shown for each variable individually.

%TABLEN(data=example, by=arm, 
    var=age date_on, type=1 3, pvals=1 5,
    contdisplay=n_nmiss mean_sd median_range,
    datedisplay=n median range)

P-values are available for either numeric or date type variables by listing one of the following numbers in the PVALS macro option:

If BY variable is present
- 0 = No p-value
- 1 = Kruskal-Wallis
- 2 = Exact Kruskal-Wallis (long calculation times)
- 3 = Wilcoxon rank sum
- 4 = Exact Wilcoxon rank sum (long calculation times)
- 5 = ANOVA F-test
- 6 = Equal variance two sample t-test
- 7 = Unequal variance two sample t-test
If BY variable is not present
- 0 = No p-value
- 1 = Student T-Test
- 2 = Sign Rank

Discrete type variables

Examples of these variables include gender, tumor staging, and race. These are variables that have a limited range of values. The TABLEN macro has several key options for controlling the output for discrete variables:

DIS_DISPLAY: Determines if counts (N), percentages (PCT) or both (N_PCT) are shown
DIS_ORDER: Allows the user to change the default order of the variable values. The default order is the sorted unformatted values. The user can specify a numeric list of the ordered values to change them. For example the variable GENDER might have the values 'Male' and 'Female' which would have the default order of Female and then Male. The user can specify DIS_ORDER=2 1 in order to change the order to Male then Female. Multiple variables can be distinguished with the pipe (|) symbol: DIS_ORDER=2 1|3 1 2|2 1 3. This option can be used to limit how many values are shown. For example with gender if DIS_ORDER=1 then only Females will be shown. This can be useful to save vertical space in the table.
DIS_INCMISS: Determines whether missing values are included as a valid value for percentages
DIS_PRINTMISS: Determines if number of missing values are printed

%TABLEN(data=example, by=arm, 
    var=smoke_st sex dz_g, type=2,  
    pvals=1 3 2, dis_order=3 1 2|1|)

The following P-values are available with the following codes in PVALS:

0 = No p-value
1 = Chi-square
2 = Fisher's exact
3 = Cochran-Armitage trend test (Either BY or VAR must have 2 levels only)

Survival type variables

The TABLEN macro performs univariate survival analyses where the variable in the VAR list represents the time variable. The following parameters allow the user to determine the rest of the survival model:

SURV_STAT: Matches a survival event variable to the time variable listed in the VAR list
CEN_VL: Determines the value representing the non-event. This must be numeric.
TIMELIST: Specifies time-points for Kaplan-Meier event-free rates. Unique time-points can be specified for each survival type variable. Other parameters allow the time variable to be transformed (TDIVISOR), a unit label to be applied (TIME_UNITS), and the display format (TIMELISTFMT: percentage vs. proportion).

The SURVDISPLAY option controls which statistics are shown and the order with the following keywords:

N: Number of patients
EVENTS: Number of events
EVENTS_N: Combines N and Events as EVENTS/N
MEDIAN: Kaplan-Meier survival median and 95% CI
HR: Cox model hazard ratio and 95% CI
COXPVAL: Cox model Wald p-value comparing parameters
TIMELIST: Kaplan-Meier time-point event-free rates and 95% CI

%TABLEN(data=example, by=arm, 
    var=fu_time pg_time, type=4,  pvals=1 3,
    surv_stat=fu_stat pg_stat, cen_vl=1, tdivisor=30.44 30.44, 
    timelist=6 12|3 6, time_units=months,
    survdisplay=events_n median hr timelist)

Different options can be specified for each survival variable with the pipe delimiter as shown in the TIMELIST option in the above example. The following p-values are available with the following values in the PVALS option:

0 = No p-value
1 = Logrank
2 = Wilcoxon
3 = Cox model type-3 score
4 = Cox model type-3 likelihood-ratio
5 = Cox model type-3 Wald

Logistic type variables

The TABLEN macro performs univariate logistic regression analyses when a BY variable is input where the variable in the VAR list represents the event variable. If a BY variable is not input then the binomial success rate is calculated. The following parameters allow the user to determine the rest of the logistic model:

LOG_EVENT: Determines the event value for the binomial variable in the VAR list.
LOG_BINFMT: Determines if the binomial success rate is calculated as a proportion or percentage.

The LOG_DISPLAY option controls which statistics are shown and the order with the following keywords:

N: Sample size
EVENTS: Number of events or successes
ODDSRATIO: Logistic regression odds ratio and 95% CI
WALDPVAL: Logistic regression Wald p-value comparing parameters
BINRATE: Binomial success rate and 95% CI

%TABLEN(data=example, by=arm, 
    var=gd3, type=5,  pvals=1,
    log_event=1,
    log_display=events_n binrate oddsratio)

The following p-values are available with the specified values in the PVALS option:

When a BY variable is present
- 0 = No p-value
- 1 = logistic regression type-3 Wald p-value
When no BY variable is present
- 0 = No p-value
- 1 = Chi-square p-value
- 2 = Fisher's exact p-value

Options for comparison and subgrouping

There are three different options for comparison or subgrouping: BY, COLBY (column BY), and ROWBY (row BY). These options can be used together or separately to account for different situations.

BY variable options

The BY variable creates one set of statistics for each level of the BY variable and allows the comparison of these measures with metrics such as p-values and hazard ratios. The following are key parameters associated with using a BY variable:

BYORDER: Allows the user to change the order of the BY variable levels without modifying the original data. The order by default is the unformatted values.
BY_INCMISS: Determines if missing values are included as a valid value for comparisons
PVALS: Determines which type of p-value is used for comparison. Different p-values are available for each variable type and are set for each variable in the VAR list individually.
REFERENCE: Specifies the reference level when computing hazard ratios or odds ratios

Column BY (COLBY) options

The COLBY option allows the tables to be nested into subgroups in multiple columns. The variable list, BY groupings, and p-values are summarized within each level of the COLBY variable. There is no limit to how many levels the COLBY variable can have, and this feature can work very nicely when combined with the ODS EXCEL destination.

ROW BY (ROWBY)

The ROWBY option allows the tables to be nested into subgroups in multiple rows. The variable list, BY groupings, and p-values are summarized within each level of the ROWBY variable. There is no limit to how many levels the ROWBY variable can have, and this feature is more useful when horizontal space is limited.

Quality of life features

The TABLEN macro is packed with features to improve quality of life of the programmer using the macro. These features include:

Full error checking for every parameter to stop the macro from crashing and ruining a SAS session. The macro cleans up all temporary data sets that it creates and restores any option values that it changed.
Options such as TYPE carry their last value forward so that values don't have to be repeated. For example with the TYPE variable if there are 5 variables that have the first as continuous and the rest discrete the following two uses of the TYPE option are the same: 1 2 2 2 2 and 1 2. In the second list the 2 will be carried forward for the remaining 3 variables if no further values are specified.
Each variable type's options will only be applied to variables of those types. For example if the first survival variable is the third variable in the VAR list, the first option listed in SURV_STAT will still apply to the first listed survival variable. There is no need to account for the other variable types.
The LABELS option allows the user to specify new labels for variables without changing their data set. The macro takes the current variable label by default.
The WHERE option allows the user to subset their data set without making a new one to run with the macro. This will not affect their original data set.
The macro can output to RTF, PDF, EXCEL, HTML, and POWERPOINT destinations while keeping the same look. The WORD destination is also being reviewed currently. The table is also looks the same in SAS Studio. The following image shows the RTF, PDF, EXCEL, SAS Studio, and POWERPOINT destinations in order:

The TABLEN macro also makes a table for the LISTING environment that works well with .LST files:

P-value footnotes are automatically created for the table along with the super scripts. These can also be turned off.

Other style options

The TABLEN macro also has several options for changing the shading, spacing and borders of the final table output.

Table shading

Table shading can either be disabled, shade alternating rows, or shade alternating variables. This is controlled by the SHADING parameter in the TABLEN macro call. The following are examples of disabling the shading and shading by row. The default of SHADING=1 to have shading by alternating row has been displayed in the prior table examples.

Table spacing

Within the table there are three options for the space between variables: adding a blank row, adding a line, and having no space or line between variables. This option is controlled by the SPLIT parameter in the TABLEN macro call. The default of SPLIT=SPACE has been displayed in the prior table examples and will add a space between each variable.

Table borders

The amount of cell borders is controlled by the BORDERDISPLAY parameter in the TABLEN macro call and has four different options. The default of BORDERDISPLAY=1 only shows border lines at the top and bottom of the table and has been used in the prior table examples. The following examples show the other options: BORDERDISPLAY=2 turns on all cell borders, BORDERDISPLAY=3 turns on all cell borders except for title and footnote, and BORDERDISPLAY=4 turns on borders for columns.

Conclusion

The %TABLEN macro is a flexible and powerful tool for creating demographic summary tables, and has the features required to meet the additional challenges put forth by more complicated clinical trials. The ability to handle multiple variable types, output to several ODS destinations, create nested tables, and perform on the fly data set modifications is useful to any programmer in the healthcare industry.

The macro is attached to this article, and the paper and presentation will be added as they are finalized. Any questions or macro related issues can be sent to:

Name: Jeffrey Meyers

Enterprise: Mayo Clinic

E-mail: Meyers.jeffrey@mayo.edu / jpmeyers.spa@gmail.com

Jianmin · ‎05-22-2020

Hi Jeffery, nice job on the paper and the macro, and you don't need another person to tell you how useful they are. Medical writers will love it for your quick results and they will hate it for your taking away their jobs.

A little comment on the default 2.4.2: Results Digit Rounding. In the industry, a general note to programmer is something like this: for continuous variable, one more digits for mean and median, 2 more digits for SD than the input value, same number of digits for Min and Max as the input value; no more than 3 digits unless otherwise. For example, Age is a whole number in clinical trials and reporting, so the Min and Max or Range will also be whole number in the report. Your macro has an option for this, but it's nice to make it a default. The default digit for % is 1.

Many thanks for your work.

Best Regards,

Jianmin

HermanoRocha · ‎07-13-2020

Dear, @JeffMeyers ,

Thank you very much for your great work and for sharing.

Do you have plans for adding multivariate options?

I´m editing my comment because I figured out that even with this error the .rtf file is properly generated, so it´s not a real problem.

Could you please help me with an issue? I'm able to generate descriptive tables with no problems, but I always get the error:

ERROR: Invalid characters were present in the data.

When trying to make BY tables. I'm trying to use a very simple dataset for testing, just two categorical variables, but I'm having no luck. Tried to code the variables, but it didn´t work either. I'm using SAS university version on virtual machine.

error:

4014 %tablen(data=WORK.MERGE0, by=motalta,

4015 var=SEXO,

4016 type=2, shading=0, outdoc=~/example1.rtf);

ERROR: Invalid characters were present in the data.

ERROR: Une erreur s'est produite lors du traitement des données texte.

TABLEN has finished processing, runtime: 0:00.406

4017

4018 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;

4030

Thank you for your attention,

JeffMeyers · ‎07-14-2020

Hello @HermanoRocha,

Thank you for trying out the macro! I have played a bit with multivariate options internally but have not incorporated them into the main macro as of yet. I have actually seen the issue you have posted a few times now from other people using the University Edition and I discovered it was caused by the way I made a non-breaking space character. In the newer version of the macro that I just uploaded (072020 version) I changed this so that it should work with the University Edition. Please let me know if this fixes the issue for you.

HermanoRocha · ‎07-14-2020

Hi @JeffMeyers ,

Thank you very much for your attention and fast response, it worked!

I´ve discovered your mvmodels macro %MVMODELS: a Macro for Survival and Logistic Analysis in another post, and was able to make the multivariate tables, still working with it but having really nice results.

Best regards

markus24135 · ‎02-15-2021

Hey @JeffMeyers , I have been using this since October 2020 and it works amazing!!

This year though, actually I should say this month, the macro as shown below (just for categorical variables with a chi-square) isn't working:

%TABLEN(data=test, by=case,

var dichot_persis, type=2,

pvals=1);

This is on SAS Enterprise Guide Version 7.15 HF8 (7.100.5.6214) 32-bit.

The output it gives is just the column headers with the correct N= under each level of case (which is just 0 and 1) and the column header "P-value" but no rows under all of that.

Immediately under that you can see "1Chi-Squarep-value;" though which means that part is working.

Any idea what could be going on?

Thank you!!!

JeffMeyers · ‎02-17-2021

Hello @markus24135 ,

Thank you for using my macro. I wanted to clarify what is in your data currently. In order to get a Chi-square value you need at least a 2x2 table. If you want to follow-up with me feel free to reach out at meyers.jeffrey@mayo.edu.

If you run the table you want in PROC FREQ do you get a p-value?

sirin_md · ‎03-04-2021

Dear @JeffMeyers ,

Trying to download the latest version 072020. Seemed like the link did not work.

Thx,

@sirin_md

JeffMeyers · ‎03-11-2021

@sirin_md Thank you for letting me know. I'm not sure why two of the attached files are linking to html files instead. Any ideas @ChrisHemedinger ? Should I try to reupload them?

In the meantime you can e-mail me at meyers.jeffrey@mayo.edu and I can send you the most recent version directly.

ChrisHemedinger · ‎03-11-2021

@JeffMeyers I don't know why that one file isn't working -- yes, if you can reupload, that's probably the quickest fix.

sirin_md · ‎03-13-2021

@JeffMeyers the latest version works beautifully with SAS ODA & Univ Edition. Thank you so much for the great work.

HermanoRocha · ‎04-12-2021

Dear @JeffMeyers

Thank you again for the great macro, I really loved it. Would you know if some similar macro would be available for making graphs following the tables (bar for categorical e box plot for numerical variables)?

Best wishes

JeffMeyers · ‎04-23-2021

Hello @HermanoRocha Unfortunately, I am not aware of a macro that also makes the figures you're describing.

Vladislav27 · ‎05-03-2021

Hellot @JeffMeyers
Great work, it is amazing macro! Could you please tell me if it is possible to leave strings with zeros for discrete variables?

JeffMeyers · ‎05-05-2021

Thank you @Vladislav27. There is only one way that I'm aware of with the macro and that is if you make use of the FREQ macro option. For example:

data x;
factor='Test 1';n=5;output;
factor='Test 2';n=4;output;
factor='Test 3';n=0;output;
run;
%tablen(data=x,var=factor,type=2,freq=n);

The values with 0 counts will be included when using this method. You will have to calculate the counts prior to running the macro.

KDS1 · ‎05-20-2021

Hi @JeffMeyers

Just a quick note to say how much I appreciate you sharing your brilliant macros with the SAS community!

I have a quick question - I can't share the data due to confidentiality issues, but it seems the PVALS option (3) for Wilcoxon Rank Sum, for continuous variables, doesn't work for me. For example the following code, doesn't produce p-values for the 3 continuous variables.

If I replace the 3s in the PVALS parameter with 4s, it does attempt to calculate the exact Wilcoxon Rank Sum.

Similarly if I replace the 3s with 2s it attempts to calculate exact Kruskal Wallis. But using 3s, no pvalues are produced - there no warnings or errors.

%tablen(data=final, by=new_class, colby=group, orientation=landscape, destination=EXCEL, out=output,
var=ageatindex sex race region practicetype yearof1L insurancetype smokingstatus charlsonscoreatbaseline
durationfromadvdxtoindexindays histology ecogatindex egfrmutationtype,
type=1 2 2 2 2 2 2 2 1 1 2 2 2, pvals=3 1 1 1 1 1 1 1 3 3 1 1 1,
labels=Age|Sex|Race|Geographic region|Physician practice type|Year of initiation of 1L therapy|Insurance type|Smoking status|NCI-Charlson comorbidities|Time from Stage IV NSCLC diagnosis to index date (days)|Histology|ECOG grade|EGFR mutation type,
contdisplay=n_nmiss mean_sd median_range, debug=1,
outdoc=/sascommon/sasusers/&un./RWE/GMD/ODSI/&pd./output/test.xlsx);

JeffMeyers · ‎05-20-2021

Hello @KDS1, I am wondering if your BY variable has more than 2 levels? SAS by default will use the Kruskal Wallis test instead of Wilcoxon when you have more than 2 groups to compare in the NPAR1WAY procedure, so the macro cannot pull that p-value in this scenario.

KDS1 · ‎05-21-2021

Thanks Jeff, you were absolutely right I had 3 levels on BY variable.

May I make a suggestion for an enhancement - I've tried to use the OUTPUT= dataset which has everything one needs to take control of output formatting except the N= number at the top of the column.

Would it be possible to add this?

PharmlyDoc · ‎07-18-2021

Hi @JeffMeyers

This is a fantastic macro!

Is there a way to have two different options for a continuous variable in the same table?

For instance, I have a row for age as CONTDISPLAY=MEAN_SD and then a row below it for median income as CONTDISPLAY=MEDIAN.

JeffMeyers · ‎07-19-2021

Hello @PharmlyDoc. I can't remember if the previous version could do that so I uploaded a more recent version (20210718) that can for sure. Just use CONTDISPLAY with | as the delimiter. For example:

%tablen(data=sashelp.class,var=height weight,type=1,contdisplay=n_nmiss median range|median iqr range);

The last value of CONTDISPLAY will carry forward to remaining type 1 variables.

@KDS1 The newer version I mentioned in this comment also updated the output dataset so that the META=1 row includes the full column headers (with counts).

user021 · ‎08-06-2021

Hi @JeffMeyers

Thanks for the awesome code.

1. It seems the option PVALDIGITS doesn't work well for me. When I add the option PVALDIGITS=2, all p values including the threshold are in 2 digits, while I want the threshold to be displayed as "<0.0001".

Can you let the macro treat p value less than threshold and the rest differently by something like adding an option for threshold (threshold =[0.05, 0.001, et al])?

2. Is there a way to do pairwise comparison when the by group is more than 2?

JeffMeyers · ‎08-07-2021

Hello @user021,

1. The macro uses the format PVALUE to control the display of p-values. The PVALDIGITS option limits the width of this format (e.g. PVALUE6.2 when PVALDIGITS=2). If you create the format you would want to use it would be pretty straightforward to modify the macro code by searching for this format in and replacing it with the format that works for what you want it to do.

2. The macro does not currently do that. The only way I could think to get close would be to make use of COLBY. Replicate the dataset where each replication only keeps 2 of the levels and add a group variable for which replication it is, then use that grouping variable in the COLBY option of the macro.

KDS1 · ‎08-10-2021

@JeffMeyers Thanks for the update and enhancements. That's awesome!

@user021For your 2nd request, If I've understood your requirement correctly, you could run the macro twice or as many times as you want, each time with the pair of interest, and use the resulting output datasets that Jeff has enhanced. Slice and dice and merge the columns as you wish. Apologies if I misunderstood your requirement.

JeffMeyers · ‎08-18-2021

Hello @PharmlyDoc,

The newer version added a feature that when only one value is displayed for a variable it is merged into the same row as the label to save space. This was requested within my work unit. In the future I can add an option to override this but in the meantime you can search for this bit of code:

%if %sysfunc(countw(%scan(%superq(contdisplay),&i,|,m),%str( )))=1 %then %do; "&&contfactor&i" %end;
%else %do; factor %end;

and set it to this to override it:

factor

PharmlyDoc · ‎08-18-2021

Hi @JeffMeyers

What am I setting to factor?

JeffMeyers · ‎08-18-2021

@PharmlyDoc replace or comment out the code in the first box and replace it with the word factor instead.

JeffMeyers · ‎08-19-2021

@PharmlyDoc, to stop the label from appending the stat you'd need to comment out this portion in the data _tester_ step:

if %scan(&type,&i,%str( ))^=2 and dim(_ndisplay_&i._)=1 then factor=strip(factor)||', '||strip(_ndisplay_&i._(1));
else if %scan(&type,&i,%str( ))=2 and dim(_ndisplay_&i._)=2 and %sysfunc(countw(%scan(%superq(dis_order),&i,|,m),%str( )))=1 and 
     %sysfunc(find(%scan(%superq(dis_order),&i,|,m),freq,i))=0 then factor=strip(factor)|| ' ('||strip(_ndisplay_&i._(1))||')';

xiangzaier · ‎08-29-2021

thank you,it is a very useful macro but if i want to display the statistics like chi-square or t value?

user021 · ‎09-09-2021

@JeffMeyers

I followed your advice on the pairwise comparison by duplicating corresponding group then using the COLBY option. The summary table is created successfully, but some columns are duplicated due to some group are duplicated. I want to remove those duplicated column and create a summary table look like below where pairwise p-values are on the right half. any advice?

2. I tried SHOWTOTAL=0 and the column for Total is gone, but when I used the option pvals=0, a blank column for p-value is still there. Is it supposed to be like that?

whymath · ‎12-26-2021

@JeffMeyers Really good job, may I ask a question?

In your source code, why replace "%if %superq(xxx)^= %then ..." with "%if %sysevalf(%superq(xxx)^=,boolean) %then ..." ? What are the advantages of the latter?

sskt · ‎02-02-2022

Hi @JeffMeyers

I am very impressed with your beautiful and easy to use program %tablen. Thanks to it, I can output beautiful tables without working overtime.
I have two questions:

1. When I use ROWBY to aggregate Discrete types, I get a lot of rows with zero entries in all columns, but I don't want to display them.
Is there any way around this by using INCMISS or PRINTMISS variables? If so, I would appreciate it if you could show me an example.

2. As with your other macros, this macro implements DESTINATION, OUTDOC, EXCEL_SHEETNAME, etc.
Is it possible to output multiple tables to different sheets of the same Excel file using these functions? I think it is impossible because of ODS, but please let me know if there is a way.

JeffMeyers · ‎03-21-2022

Hello @sskt

Thank you for your kind words. Sorry I have not been able to login to here for some time. Feel free to e-mail me in the future (jpmeyers.spa@gmail.com).

For number 1 I don't have an option at this time, but it is something I would like to implement in the future. I need to figure out the code to check if row is entirely zeros or not and it becomes complicated with COLBY as well.

For number 2 instead of using the OUTDOC option wrap the macro with ODS tags.

for example:

ods rtf file='some file path';

%tablen(...)

ods rtf close;

This can be done with any ODS destination including ODS EXCEL.

JeffMeyers · ‎03-21-2022

Hello @whymath

I have found better luck overall using the %sysevalf function when it comes to comparing to null values or comparing macro text strings. I have run into issues several times with errors such as "%then written as text" which can crash the session.

Bravez · ‎03-26-2022

Hi @JeffMeyers ,

Very impressive work for increasing efficiency in building such tables by wrapping a lot of tedious activities in one big macro. Just some questions:

does it allow multiple groups (e.g. 3 groups or more, and multiple group comparison)?
does it allow to do both mean/median etc. and count/percent (with customized categories) directly from a continuous variable? or it requires extra preparation, two separated variables, in the input dataset?

Side question: I remember once I've seen somewhere a "HSR" summary macro from Mayo for outputting complex tables in rtf/word, etc. Is there any SAS paper on this or you could share some info?

Thanks a lot.

JeffMeyers · ‎04-07-2022

Hello @Bravez

I'm not sure what you mean exactly by multiple groups. There can be as many levels in the BY variable as you need. If you are trying to have 3 groups and get a p-value to compare 1 vs 2 and 1 vs 3 and 2 vs 3 there isn't a direct way to do that. You could setup a COLBY variable to group 1&2 into a column, 1&3 into a column, and 2&3 into a column to get separate p-values, but then you'd have each group repeated twice.

Your second idea is something the macro cannot currently do, but sounds like an interesting addition I could add to the macro at some point.

The older HSR macro was called %table or %summary I believe. I made this macro to be an improvement on the features of the %table macro since the programming was pretty outdated. I think the %summary macro can still be found at this link (https://bioinformaticstools.mayo.edu/?category=20) which has some macros that Mayo shares externally.

Bravez · ‎04-07-2022

Hi @JeffMeyers ,

Thanks a lot! Actually I have tried to find this link several times from the Mayo website. You made my day. 🙂

sskt · ‎05-23-2022

Hi @JeffMeyers,

Thanks to your help, the Excel output worked well.
Thank you very much!

One more question, is there any way to output multiple tables into one ppt file?
Currently, it doesn't work and we are trying to output several ppt files and combine them in Powershell.
We are also having trouble with one table sticking out of the slide and spanning two slides.

JeffMeyers · ‎05-23-2022

Hello @sskt ,

When you want to output multiple tables to the same document make sure that you are not using the OUTDOC option. Put the macro calls within an already existing ODS PDF such as the following:

ods pdf file='all_tables.pdf';
%tablen(...)

%tablen(...)

%table(... )

ods pdf close;

For the table that is too large there are a couple of options.

1) Change to landscape with options orientation=landscape in the macro call

2) Change the font size and column width options in the macro to make the text and table widths smaller

3) Before your ods pdf change the pagesize option: options pagesize=(10in 12in); (look up the syntax prior). This will change the size of the whole document but will allow you to fit your tables. It will probably mess up printing though if you are going to try to print documents.

Crystal_F · ‎05-30-2022

Hello @JeffMeyers

Thanks for sharing your great work. It makes my life much easier with your macro. May I ask if it's possible to provide 95% CI for continous variable as well? It would be great if the table could provide N and % in seperate cell, which would be even better when people want to generate figures using the output data. I wish I could make some adjustment of your program to achieve it. But it's out of my capability. Any suggestion will be greatly valued. Thank you!

user021 · ‎06-02-2022

Hello @JeffMeyers,

Any chance a row total can be added? Right now the row total is included in the column label as (N=xxx). It's a pain to transfer those into a row total.

Thank you!

JeffMeyers · ‎07-11-2022

Hello @user021 ,

Do you have an example of the type of row total that you're trying to make?

Hello @Crystal_F ,

Sorry for the delay in response. I recently started a new position and haven't had as much time to respond here. Which continuous metric are you wanting the CI for? I typically build these based on my personal experiences and haven't had to make many CI for continuous variables. I don't really have the ability to separate N and % into separate columns due to the nature of needing 1 column to store several types of output types, but I could alter the output dataset in the future to save the numeric values for the component variables. This would lead to the output dataset becoming much larger horizontally.

User04 · ‎08-05-2022

Hello @JeffMeyers,

I have a problem with the macro variable BY_INCMISS. I have missing values in the BY variable (3 categories). I do not want to take the missing category into account when calculating the p-values of the Kruskal-Wallis test. However, no matter how I set BY_INCMISS, the missing category is always taken into account. I have an additional ROWBY variable. The same occurs when I specify an additional COLBY variable.

Do you have any suggestions?

Thank you!

user021 · ‎08-26-2022

Hello @JeffMeyers,

I think @Crystal_F meant to say 95% CI or CL (Confidence Limit) of MEAN for continuous variable. It is generated with CLM option of proc means. Can you add it as options of CL and MEAN_CL in this macro?

Thank you!

Crystal_F · ‎09-11-2022

Thank you for your response! @JeffMeyers Yes. I was asking for CL of mean as pointed out by @user021 . Congratulations on the new position!

Brownbrown21 · ‎09-13-2022

Hi Jeff,

Thanks a lot for the great macro! You have no idea how much time you've saved me with this!!

Quick question - is there an option to reorder the columns such that "Total" comes first? I've only figured how to reorder the variable columns, but not the total.

Thanks again!

JeffMeyers · ‎09-13-2022

Hello @Brownbrown21

thank you for using my macro. The option you're looking for is TOTAL_COL_POS=start. If the version you have doesn't have this option please e mail me at jpmeyers.spa@gmail.com.

Brownbrown21 · ‎09-13-2022

Thank you for the prompt response, you're such a lifesaver! I just downloaded version 2.44 07/14/2021 and it's working well. Greatly appreciate your help!

user021 · ‎09-28-2022

@JeffMeyers

I want only Yes values of a Yes/No variable and I specified DIS_ORDER=2. It works, but can we have such an option to merge the row of Yes with the row of variable name in this scenario?

JeffMeyers · ‎10-07-2022

Hello @user021 . I don't know if the last version I updated on this page has this feature but my local one does. If you e-mail me at jpmeyers.spa@gmail.com I can send the most recent version that I have. I don't currently use the macro in my current position so I haven't updated it in a while.

sskt · ‎10-26-2022

Hello, @JeffMeyers .

I believe that many people, including myself, want a new version of the program, so I would appreciate it if you could upload an additional, more up-to-date version to this page.

beloko · ‎10-27-2022

Hi @JeffMeyers ,

Thanks for sharing tablen. You should never have to pay for lunch when a SAS programmer is near 🙂. I'm attempting to add standardized difference as an option for comparing groups. Per chance have you already done this and are just about to hit "post"? If not, any tips?

I'm planning to leverage a macro from the Cleveland Clinic:

https://support.sas.com/resources/papers/proceedings12/335-2012.pdf
https://www.lerner.ccf.org/qhs/software/lib/stddiff.sas

Thanks again!