BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Zaichik17
Calcite | Level 5

Hello,

I'm new to SAS and I'm having trouble creating a nice looking summary table. This is for a homework project but I would like it to look presentable like in a journal.  Below is a sample of my data and what tables I currently have.  I used Proc Tabulate and ODS output did produce a pretty okay looking table but ideally, I would also want to merge a Fisher exact P value that I got from PROC Freq onto that table.  I have done this previously with PROC SQL but i needed to Proc transpose first and I cannot figure out how to do this with this data.  Does anyone have simpler suggestions?

 

PROC FORMAT;
VALUE Sex 2='Female' 1='Male';
VALUE Race 5='White' 4= 'Asian' 2 ='African American' 1='Native American';
VALUE Smoker 1='Yes' 0='No' . = 'Unknown';
VALUE Trtgroup 1='Placebo' 2='Untreated/no gel' 3='Low Dose' 4='Medium Dose' 5 ='High Dose';
VALUE Age
. = 'Unknown'
LOW - 45 = '< 45 years'
45 - 55 = '45 - 55 years'
55 - HIGH = '> 55 years';
RUN;
DATA studydata;
INFILE Datalines dlm =",";
INPUT id trtgroup sex race age smoker;
FORMAT sex sex. race race. smoker smoker.
trtgroup trtgroup. age age.;
DATALINES;
100,3,1,5,56.12867899,1
101,5,1,5,53.88090349,1
102,1,1,5,39.29089665,1
103,2,1,5,46.55167693,1
104,3,1,5,69.82340862,1
105,4,1,5,54.48323066,1
106,4,1,5,55.09377139,1
107,5,1,5,53.25941136,1
108,2,2,5,41.01574264,1
109,3,2,2,44.13689254,0
110,1,2,5,30.40383299,0
111,2,2,5,59.07186858,0
112,3,2,5,61.2019165,0
113,5,2,5,64.60232717,0
114,4,2,5,66.86379192,0
;
RUN;
/* ods trace on; */
ods output CrossTabFreqs=freqtable1;
ods output Fishersexact = fishertest1;
PROC FREQ DATA=studydata;
TABLE trtgroup * (sex race age smoker) /outpct MISSPRINT FISHER;
RUN;
/* ods trace off; */

PROC PRINT DATA= freqtable noobs;
RUN;

PROC PRINT DATA = fishertest;
RUN;

*Descriptive Statistics for demographic data by treatment groups*;

ODS PDF FILE = "&project1/Canstudy/Documents/Table1.PDF";
/* ods trace on; */
ODS OUTPUT Table=SummaryTable1_pre;
PROC TABULATE DATA = studydata MISSING;
CLASS trtgroup sex race age smoker;
TABLE sex race smoker age, trtgroup = 'Treatment Group' * (N COLPCTN = '%'* F=5.1)
/BOX = 'Characteristics';
RUN;
/* ods trace off; */
ODS PDF CLOSE;

PROC PRINT DATA=SummaryTable1_pre1;
RUN;

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

I would recommend this macro

https://communities.sas.com/t5/SAS-Communities-Library/Demographic-Table-and-Subgroup-Summary-Macro-...

 

PROC FORMAT;
VALUE Sex 2='Female' 1='Male';
VALUE Race 5='White' 4= 'Asian' 2 ='African American' 1='Native American';
VALUE Smoker 1='Yes' 0='No' . = 'Unknown';
VALUE Trtgroup 1='Placebo' 2='Untreated/no gel' 3='Low Dose' 4='Medium Dose' 5 ='High Dose';
VALUE Age
. = 'Unknown'
LOW - 45 = '< 45 years'
45 - 55 = '45 - 55 years'
55 - HIGH = '> 55 years';
RUN;
DATA studydata;
INFILE Datalines dlm =",";
INPUT id trtgroup sex race age smoker;
FORMAT sex sex. race race. smoker smoker.
trtgroup trtgroup. age age.;
DATALINES;
100,3,1,5,56.12867899,1
101,5,1,5,53.88090349,1
102,1,1,5,39.29089665,1
103,2,1,5,46.55167693,1
104,3,1,5,69.82340862,1
105,4,1,5,54.48323066,1
106,4,1,5,55.09377139,1
107,5,1,5,53.25941136,1
108,2,2,5,41.01574264,1
109,3,2,2,44.13689254,0
110,1,2,5,30.40383299,0
111,2,2,5,59.07186858,0
112,3,2,5,61.2019165,0
113,5,2,5,64.60232717,0
114,4,2,5,66.86379192,0
;
RUN;

filename tableN url 'https://communities.sas.com/kntur85557/attachments/kntur85557/library/4477/10/tablen_web_20210718.sas';

%include tableN ;

filename tableN;

%tablen(data=studydata, by=trtgroup, 
     var=sex race smoker,
     type=2, pvals=2, outdoc=/home/fkhurshed/Demo1/example1.rtf);

Results:

 

 
 
  trtgroup  
  Placebo
(N=2)
Untreated/no gel
(N=3)
Low Dose
(N=4)
Medium Dose
(N=3)
High Dose
(N=3)
Total
(N=15)
P-value
sex, n (%)             1.00001
Male 1 (50.0%) 1 (33.3%) 2 (50.0%) 2 (66.7%) 2 (66.7%) 8 (53.3%)  
Female 1 (50.0%) 2 (66.7%) 2 (50.0%) 1 (33.3%) 1 (33.3%) 7 (46.7%)  
               
race, n (%)             1.00001
African American 0 (0.0%) 0 (0.0%) 1 (25.0%) 0 (0.0%) 0 (0.0%) 1 (6.7%)  
White 2 (100.0%) 3 (100.0%) 3 (75.0%) 3 (100.0%) 3 (100.0%) 14 (93.3%)  
               
smoker, n (%)             1.00001
No 1 (50.0%) 1 (33.3%) 2 (50.0%) 1 (33.3%) 1 (33.3%) 6 (40.0%)  
Yes 1 (50.0%) 2 (66.7%) 2 (50.0%) 2 (66.7%) 2 (66.7%) 9 (60.0%)  
1Fisher Exact p-value;

 

EDIT: added in the code, and fixed the code because filename references cannot be longer than 8 chars.

 

View solution in original post

8 REPLIES 8
PaigeMiller
Diamond | Level 26

I think the only way to do this is to merge the Fisher's exact test statistics with your data set named SummaryTable1_pre (or maybe with studydata) and create another table.

--
Paige Miller
ballardw
Super User

@Zaichik17 wrote:

Hello,

I'm new to SAS and I'm having trouble creating a nice looking summary table. This is for a homework project but I would like it to look presentable like in a journal. 


The portion of your question that I highlighted would point me to one or more report procedures, Proc Print, Report or Tabulate to create the final output. These procedures are designed to have features that make "pretty" output. While many of the procedures in SAS will make reasonable tabular output they usually do not have as many features available as the report procedures.

If these can't do what you want you may want the much finer control, with associated learning curve and more complex coding, of the data step Report Writing Interface.

 

Which procedure and how to use it depends a lot on what you want to convey. If you want everything to appear in a single "table" for output then likely that means that you will need to place all of the data into a single data set so all the values are available when needed. How to create that data set will depend on the desired final table and the procedure you want.

 

Combining summary information, such as Fishers test results, with other data often points to summarizing the data before combining the separate elements. One reason for that is that you can manually examine the result to see if everything aligned correctly. Another is that you will know the values that should appear in the report.

 

Without a pretty concrete example of what you expect the output to look like only generic comments like the above can be provided.

Example data, in the form of a data step, and some idea of what you want as output from that data we can possibly provide much more detailed answers and discuss options you may not have considered.

Zaichik17
Calcite | Level 5

I would like everything to appear in one table.  This is what proc tabulate looks like with the code I provided in my question:

Zaichik17_0-1662850431818.png

I would then like to have N(%) instead of them separate but Proc Tabulate will not do that.  In addition, I would like to have a column to the far right with the Fisher test p-value basically like this but with all 5 treatment groups:

Table 1. Characteristics of the Study Sample, by treatment group

 

Treatment Group

F.E. 

p-value

 

 

(N = 26)

 

 

Sex (N, %)

Male

 

Female

 

 

 

Race (N, % )

White

 

African American

 

Asian

 

Native American

 

 

 

Smoking Status (N,%)

  Smoker

 

Non-smoker

 

 

 

Age group

 

 

 

  <45 years (N, %)

 

 

 

  45-55 years (N, %)

 

 

 

  >55 years (N, %)

 

 

 

 

I found several papers such as this one: http://www.lexjansen.com/mwsug/2011/pharma/MWSUG-2011-PH02.pdf

Which shows how I would like my table to look, this example only has 2 groups and a total, I would just need 5 treatment groups and a p value instead.  I could not follow the instructions in the paper for my data, it was too complex for me at the moment.  I do not know how to use PROC REPORT but I think it would be able to produce this result, however since it works in COLUMNS I don't know how variables sex, age, race, and smoker would need to be coded to do that.  Or I could continue with the PROC tabulate table I already have, to just merge the FE value from Proc freq but I think I need to transpose that table first?  I cannot figure out how to use proc transpose though on the proc tabulate summary table.

Cynthia_sas
SAS Super FREQ

Hi:
To produce the kind of output you want with PROC REPORT, you'd have to manipulate your data to be in the form you wanted. For example, if you take a look on page 9 and 17 of this paper https://support.sas.com/resources/papers/proceedings/pdfs/sgf2008/173-2008.pdf You'll see what I call a "demographic" report in the paper. Page 9's reports were produced with PROC REPORT and page 17 was produced using a DATA step and custom table template. The data for the report on the left side of page 9 was heavily manipulated before the PROC REPORT step to take data from PROC FREQ and PROC MEANS and bash the data all together with some helper variables to create the final report in PROC REPORT. As an example, here's what that data started as and what the final data looked like:

Cynthia_sas_0-1662854467146.png

Then the file on the right side of the page 9 started out looking like Patient_Info except the procedures that were used to make the final WORK.TREATDATA file were slightly different to get the P value and the OddsRatio/CI. But the same concept applied -- the data was restructured and helper variables were created so the nice formatting could be done.

Cynthia_sas_1-1662854503810.png

For your purposes, I think getting your Fishers statistic merged with your other summarized data and then passed to PROC PRINT would probably be OK for your class assignment. I would not consider either of the programs I used to be necessary for a homework assignment. Might be worth the work for a thesis or dissertation, but it depends on whether the focus of class is on the statistics and learning SAS or on making things pretty.

 

  By the way, good job on figuring out PROC TABULATE to do your first pass. Note that PROC TABULATE will create an output data set for you if you use the OUT= option on the PROC TABULATE statement. That resulting table will not be in the same tabular structure as the TABULATE output, but you might be able to use it in a merge with your other procedure's output. As an example, if you take a simple example of using PROC TABULATE for SEX and AGE in SASHELP.CLASS, you can get this from the use of OUT=:

Cynthia_sas_2-1662855256099.png

Notice how TABULATE gives you a very nice helper variable called _TYPE_ which tells you which CLASS variables or combination of class variables were used to calculate the N and PCTN for those rows.


Cynthia

Zaichik17
Calcite | Level 5

Hi Cynthia,

 

Thank you for your reply, PROC print seems to be out of the scope of my knowledge and yes with this is for a data analysis class and I don't need to go to that extent.  I am trying to clean my data so that I can merge the Fisher value, I decided to try a different approach and use proc freq and proc transpose each variable separately and then I combined them with PROC SQL, however my demographic variables reverted back to their original values( 1/2 for sex, 2/5 for race, 0/1 for smk, and age is average) do you know how I can have the variables show as Female/male, white/africanamerican...etc again?  As well as include Sex,race,smoker and age as headers separating them?Here is what the PROC SQL looks like:

 

Zaichik17_0-1662999455497.png

 

My code:

PROC FORMAT;
VALUE Sex 2='Female' 1='Male';
VALUE Race 5='White' 4= 'Asian' 2 ='African American' 1='Native American';
VALUE Smoker 1='Yes' 0='No' . = 'Unknown';
VALUE Trtgroup 1='Placebo' 2='Untreated/no gel' 3='Low Dose' 4='Medium Dose' 5 ='High Dose';
VALUE Age
. = 'Unknown'
LOW - 45 = '< 45 years'
45 - 55 = '45 - 55 years'
55 - HIGH = '> 55 years';
RUN;
DATA studydata2;
INFILE Datalines dlm =",";
INPUT id trtgroup sex race age smoker;
FORMAT sex sex. race race. smoker smoker.
trtgroup trtgroup. age age.;
DATALINES;
100,3,1,5,56.12867899,1
101,5,1,5,53.88090349,1
102,1,1,5,39.29089665,1
103,2,1,5,46.55167693,1
104,3,1,5,69.82340862,1
105,4,1,5,54.48323066,1
106,4,1,5,55.09377139,1
107,5,1,5,53.25941136,1
108,2,2,5,41.01574264,1
109,3,2,2,44.13689254,0
110,1,2,5,30.40383299,0
111,2,2,5,59.07186858,0
112,3,2,5,61.2019165,0
113,5,2,5,64.60232717,0
114,4,2,5,66.86379192,0
;
RUN;

proc print data=studydata2;
run;
*desriptive statistics for sex*;

ods output crosstabfreqs=sexcross; *Output dataset for crosstabs;
ods output fishersexact=sexfishers; *Output dataset for Fisher's Exact test;
proc freq data=studydata2;
tables sex*trtgroup/ fisher outpct;
run;

proc print data = sexcross;
run;
proc print data= sexfishers;
run;

data sexcross1;
set sexcross;
if _type_='11';
keep sex trtgroup frequency colpercent;
run;
*Modify N and Percent to create character formats for PROC REPORT;
data sexcross2;
set sexcross1;
length n_perc $12;
n=put(frequency,3.);
perc=put(colpercent, 5.1);
n_perc=(n||' ('||perc||'%'||')'); *Variable in form xx (xx.x%);
run;
proc sort data=sexcross2;
by sex trtgroup;
run;
*Switch orientation;
proc transpose data=sexcross2 out=newsexstats1;
by sex ;
id trtgroup;
var n_perc;
run;
* combine with fisher exact*;
data sexcross_fish;
merge newsexstats1 fishers(where=(name1='XP2_FISH') keep=name1 cvalue1);
prob=put(cvalue1,5.3);
drop name1 cvalue1;
run;

*smoker descriptive statistics*;
ods output crosstabfreqs=smokercross; *Output dataset for crosstabs;
ods output fishersexact=smokerfishers; *Output dataset for Fisher's Exact test;
proc freq data=studydata2;
tables smoker*trtgroup/ fisher outpct;
run;

proc print data = smokercross;
run;
proc print data= smokerfishers;
run;

data smokercross1;
set smokercross;
if _type_='11';
keep smoker trtgroup frequency colpercent;
run;
*Modify N and Percent to create character formats for PROC REPORT;
data smokercross2;
set smokercross1;
length n_perc $12;
n=put(frequency,3.);
perc=put(colpercent, 5.1);
n_perc=(n||' ('||perc||'%'||')'); *Variable in form xx (xx.x%);
run;
proc sort data=smokercross2;
by smoker trtgroup;
run;
*Switch orientation;
proc transpose data=smokercross2 out=newsmokerstats1;
by smoker ;
id trtgroup;
var n_perc;
run;
* combine with fisher exact*;
data smokercross_fish;
merge newsmokerstats1 fishers(where=(name1='XP2_FISH') keep=name1 cvalue1);
prob=put(cvalue1,5.3);
drop name1 cvalue1;
run;

proc print data=smokercross_fish;
run;

*race descriptive statistics*;
ods output crosstabfreqs=racecross; *Output dataset for crosstabs;
ods output fishersexact=racefishers; *Output dataset for Fisher's Exact test;
proc freq data=studydata2;
tables race*trtgroup/ fisher outpct;
run;

proc print data = racecross;
run;
proc print data= racefishers;
run;

data racecross1;
set racecross;
if _type_='11';
keep race trtgroup frequency colpercent;
run;
*Modify N and Percent to create character formats for PROC REPORT;
data racecross2;
set racecross1;
length n_perc $12;
n=put(frequency,3.);
perc=put(colpercent, 5.1);
n_perc=(n||' ('||perc||'%'||')'); *Variable in form xx (xx.x%);
run;
proc sort data=racecross2;
by race trtgroup;
run;
*Switch orientation;
proc transpose data=racecross2 out=newracestats1;
by race ;
id trtgroup;
var n_perc;
run;
* combine with fisher exact*;
data racecross_fish;
merge newracestats1 fishers(where=(name1='XP2_FISH') keep=name1 cvalue1);
prob=put(cvalue1,5.3);
drop name1 cvalue1;
run;

proc print data=racecross_fish;
run;

*age descriptive statistics*;
ods output crosstabfreqs=agecross; *Output dataset for crosstabs;
ods output fishersexact=agefishers; *Output dataset for Fisher's Exact test;
proc freq data=studydata2;
tables age*trtgroup/ fisher outpct;
run;

proc print data = agecross;
run;
proc print data= agefishers;
run;

data agecross1;
set agecross;
if _type_='11';
keep age trtgroup frequency colpercent;
run;
*Modify N and Percent to create character formats for PROC REPORT;
data agecross2;
set agecross1;
length n_perc $12;
n=put(frequency,3.);
perc=put(colpercent, 5.1);
n_perc=(n||' ('||perc||'%'||')'); *Variable in form xx (xx.x%);
run;
proc sort data=agecross2;
by age trtgroup;
run;
*Switch orientation;
proc transpose data=agecross2 out=newagestats1;
by age ;
id trtgroup;
var n_perc;
run;

* combine with fisher exact*;
data agecross_fish;
merge newagestats1 fishers(where=(name1='XP2_FISH') keep=name1 cvalue1);
prob=put(cvalue1,5.3);
drop name1 cvalue1;
run;

proc print data=agecross_fish;
run;

proc sql;
create table summarytable_3 as
select sex as variable, Placebo, Untreated_no_gel, Low_Dose, Medium_Dose, High_Dose,prob from sexcross_fish
union all
select race, Placebo, Untreated_no_gel, Low_Dose, Medium_Dose, High_Dose,prob from racecross_fish
union all
select smoker, Placebo, Untreated_no_gel, Low_Dose, Medium_Dose, High_Dose,prob from smokercross_fish
union all
select age, Placebo, Untreated_no_gel, Low_Dose, Medium_Dose, High_Dose,prob from agecross_fish;
quit;

proc print data = summarytable_3;
run;

Reeza
Super User

I would recommend this macro

https://communities.sas.com/t5/SAS-Communities-Library/Demographic-Table-and-Subgroup-Summary-Macro-...

 

PROC FORMAT;
VALUE Sex 2='Female' 1='Male';
VALUE Race 5='White' 4= 'Asian' 2 ='African American' 1='Native American';
VALUE Smoker 1='Yes' 0='No' . = 'Unknown';
VALUE Trtgroup 1='Placebo' 2='Untreated/no gel' 3='Low Dose' 4='Medium Dose' 5 ='High Dose';
VALUE Age
. = 'Unknown'
LOW - 45 = '< 45 years'
45 - 55 = '45 - 55 years'
55 - HIGH = '> 55 years';
RUN;
DATA studydata;
INFILE Datalines dlm =",";
INPUT id trtgroup sex race age smoker;
FORMAT sex sex. race race. smoker smoker.
trtgroup trtgroup. age age.;
DATALINES;
100,3,1,5,56.12867899,1
101,5,1,5,53.88090349,1
102,1,1,5,39.29089665,1
103,2,1,5,46.55167693,1
104,3,1,5,69.82340862,1
105,4,1,5,54.48323066,1
106,4,1,5,55.09377139,1
107,5,1,5,53.25941136,1
108,2,2,5,41.01574264,1
109,3,2,2,44.13689254,0
110,1,2,5,30.40383299,0
111,2,2,5,59.07186858,0
112,3,2,5,61.2019165,0
113,5,2,5,64.60232717,0
114,4,2,5,66.86379192,0
;
RUN;

filename tableN url 'https://communities.sas.com/kntur85557/attachments/kntur85557/library/4477/10/tablen_web_20210718.sas';

%include tableN ;

filename tableN;

%tablen(data=studydata, by=trtgroup, 
     var=sex race smoker,
     type=2, pvals=2, outdoc=/home/fkhurshed/Demo1/example1.rtf);

Results:

 

 
 
  trtgroup  
  Placebo
(N=2)
Untreated/no gel
(N=3)
Low Dose
(N=4)
Medium Dose
(N=3)
High Dose
(N=3)
Total
(N=15)
P-value
sex, n (%)             1.00001
Male 1 (50.0%) 1 (33.3%) 2 (50.0%) 2 (66.7%) 2 (66.7%) 8 (53.3%)  
Female 1 (50.0%) 2 (66.7%) 2 (50.0%) 1 (33.3%) 1 (33.3%) 7 (46.7%)  
               
race, n (%)             1.00001
African American 0 (0.0%) 0 (0.0%) 1 (25.0%) 0 (0.0%) 0 (0.0%) 1 (6.7%)  
White 2 (100.0%) 3 (100.0%) 3 (75.0%) 3 (100.0%) 3 (100.0%) 14 (93.3%)  
               
smoker, n (%)             1.00001
No 1 (50.0%) 1 (33.3%) 2 (50.0%) 1 (33.3%) 1 (33.3%) 6 (40.0%)  
Yes 1 (50.0%) 2 (66.7%) 2 (50.0%) 2 (66.7%) 2 (66.7%) 9 (60.0%)  
1Fisher Exact p-value;

 

EDIT: added in the code, and fixed the code because filename references cannot be longer than 8 chars.

 

Reeza
Super User

FYI - if you add labels to your variables that will control the display of the variable names.

Zaichik17
Calcite | Level 5

Wow thank you, that worked exactly like I needed it! 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 8 replies
  • 832 views
  • 0 likes
  • 5 in conversation