I am trying to report the results of my regression analysis in a journal-quality format. The usual format for regression results looks something like this, with estimates for different models in separate columns, significance levels marked by asterisks, and standard errors in parentheses below each estimate:
SAS does not report regression results this way, so I combined the results from different output tables into a dataset like this:
I used PROC REPORT to put the output in a publication-quality format. This is my code:
ODS ESCAPECHAR='^';
ods pdf file='inddind_drugNEW.pdf' style=journal2 notoc;
proc report data=combinedrugY(rename=(stderr=std1 ProbChiSq=pb1)) out=temptbl;
  column variable model,(pb1 estimate std1);
  define variable / group order=data ' ';
  define model / across ' ';
  define pb1 / noprint sum;
  define estimate / analysis sum ' ';
  define std1 / analysis sum format=stderrf. ' ';
  compute estimate;
    array cc _c11_ _c8_ _c5_ _c2_;
    do i = 1 to 4;
      if ~missing(cc(i)) then do;
        if 0.05 < cc(i) <= 0.1 then call define(_col_, "style", "style=[ posttext='*']");
        else if 0.01 < cc(i) <= 0.05 then call define(_col_, "style", "style=[ posttext='**']");
        else if cc(i) <= 0.01 then call define(_col_, "style", "style=[ posttext='***']");
        leave;
      end;
    end;
  endcomp;
run;
ods text="^S={width=100% just=c } NOTE: Logit regression for probability of an individual diagnosed with abuse or addiction of different substances, all specifications control for year and state and fixed effects. ^{newline} ^S={just=l }*** Significant at the 1 percent level. ^{newline} ** Significant at the 5 percent level. ^{newline} * Significant at the 10 percent level. ^{newline} ";
ods pdf close;
title;
footnote;
The output looks like this:
As you can see, it is still not in the format I was hoping for. It would be great if someone could help me fix the table so that it:
1) puts std1 under the estimate instead of beside it.
2) aligns the note text with the left edge of the table instead of the start of the line.
Thank you.
Can you provide some sample data to help work with this?
You can also search lexjansen.com for 'clinical reporting' to see many samples.
I am also very interested in automating the process of reporting a publication-ready table of regression results of the type described. I thank zzecon for the code that helped me start the process.
I am getting closer, but I am still not there.
The table I start with (attached) appends the parameter estimates, fit, and nobs tables from the PROC REG results.
PROC IMPORT OUT= WORK.PARM
DATAFILE= "G:\temp\parm.xls"
DBMS=EXCEL REPLACE;
RANGE="parm";
GETNAMES=YES;
MIXED=NO;
SCANTEXT=YES;
USEDATE=YES;
SCANTIME=YES;
RUN;
/*write a dataset with standard error on the same column of coefficient*/
data parm2;
set temp.parm;
if not missing(stderr) and variable not in ('R-square','Adj.R-sq','N. Obs.') then do;
value=estimate; type='coefficient'; output; end;
if not missing(stderr) then do; value=stderr; type='stderr' ;output; end;
if variable in ('R-square','Adj.R-sq','N. Obs.') then do; value=estimate; type=variable ;output; end;
run;
proc format;
picture stderrf (round)
low-high=' 9.9999)' (prefix='(')
.=' ';
run;
ods html close;
ods html;
title;
proc report data=parm2 nowd out=temptbl;
*column model numord variable type dependent, value;
column numord variable type dependent, (value probt);
define numord /group order=data noprint;
define variable / group order=data ' ';
define type / group order=data noprint;
define dependent / across ' ';
define value /analysis sum;
define probt /analysis sum;
compute value;
array cc _c5_ _c7_ _c9_ ;
if type='stderr' then do;
call define(_col_,'format','stderrf.');
end;
else if type='coefficient' then do;
call define(_col_,'format','8.4');
do i=1 to 3;
if 0.05<cc(i) <= 0.1 then call define(_col_, "style", "style=[ posttext='*']" );
else if 0.01 <cc(i) <=0.05 then call define(_col_, "style", "style=[ posttext='**']" );
else if cc(i) <= 0.01 then call define(_col_, "style", "style=[ posttext='***']" );
end;
end;
endcomp;
run;
SAS Output
| | Y1 value | Y1 Pr > \|t\| | Y2 value | Y2 Pr > \|t\| | Y3 value | Y3 Pr > \|t\| |
|---|---|---|---|---|---|---|
| Intercept | 0.2788*** | <.0001 | 0.7622*** | <.0001 | -1.4992*** | <.0001 |
| | (0.0097) | <.0001 | (0.0067) | <.0001 | (0.0191) | <.0001 |
| X1 | -0.0313*** | <.0001 | -0.0800*** | <.0001 | -0.2424*** | <.0001 |
| | (0.0009) | <.0001 | (0.0006) | <.0001 | (0.0018) | <.0001 |
| X2 | 0.0137*** | <.0001 | 0.0172*** | <.0001 | 0.0048*** | <.0001 |
| | (0.0003) | <.0001 | (0.0002) | <.0001 | (0.0006) | <.0001 |
| X3 | -0.0001*** | <.0001 | -0.0002*** | 0.0300 | 0.0001*** | <.0001 |
| | (0.0000) | <.0001 | (0.0000) | 0.0300 | (0.0000) | <.0001 |
| X4 | -0.0190*** | <.0001 | -0.0219*** | <.0001 | -0.0892*** | <.0001 |
| | (0.0009) | <.0001 | (0.0006) | <.0001 | (0.0018) | <.0001 |
| X5 | 0.0228*** | <.0001 | 0.0590*** | <.0001 | 0.2793*** | 0.1500 |
| | (0.0013) | <.0001 | (0.0009) | <.0001 | (0.0025) | 0.1500 |
| X6 | -0.0942*** | <.0001 | -0.0322*** | <.0001 | 0.0745*** | <.0001 |
| | (0.0021) | <.0001 | (0.0014) | <.0001 | (0.0040) | <.0001 |
| R-square | 0.2203 | . | 0.4373 | . | 0.3009 | . |
| Adj.R-sq | 0.2202 | . | 0.4373 | . | 0.3008 | . |
| N. Obs. | 132000 | . | 132000 | . | 132000 | . |
In the report I have left the probt column printed to show that the stars next to the coefficients are not added correctly.
What is wrong?
Any help to solve this problem and clean up the appearance of the table is appreciated.
Thank you very much in advance
attached file
I think the reason you don't get the correct stars is that PROC REPORT performs its calculations from left to right, so the column of p-values has to come before the column it decorates. You also need to make sure that you don't go through the loop for every value of cc(i); that is why I have the LEAVE statement at the end of the loop, and why the array lists the p-value columns starting from the rightmost one.
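To make the ordering point concrete, here is a minimal sketch with made-up numbers. The dataset `est`, its values, and the two models M1 and M2 are hypothetical and only stand in for a real parameter estimates table. It follows the approach described in this thread: probt comes before estimate in the COLUMN statement, and the array lists the p-value columns from right to left.

```sas
/* Hypothetical toy data: one row per variable and model */
data est;
  input variable $ model $ probt estimate;
  datalines;
X1 M1 0.003 0.50
X1 M2 0.080 0.20
X2 M1 0.400 0.10
X2 M2 0.020 0.30
;
run;

proc report data=est nowd;
  /* probt is listed before estimate, so each p-value is already
     filled in when the COMPUTE block for estimate runs */
  column variable model,(probt estimate);
  define variable / group order=data ' ';
  define model    / across ' ';
  define probt    / analysis sum noprint;
  define estimate / analysis sum format=8.4 ' ';
  compute estimate;
    /* Report columns: _c1_ = variable, _c2_ = probt (M1),
       _c3_ = estimate (M1), _c4_ = probt (M2), _c5_ = estimate (M2).
       The array lists the p-value columns from right to left. */
    array cc _c4_ _c2_;
    do i = 1 to 2;
      if not missing(cc(i)) then do;
        if 0.05 < cc(i) <= 0.1 then
          call define(_col_, 'style', "style=[posttext='*']");
        else if 0.01 < cc(i) <= 0.05 then
          call define(_col_, 'style', "style=[posttext='**']");
        else if cc(i) <= 0.01 then
          call define(_col_, 'style', "style=[posttext='***']");
        leave;  /* stop at the first non-missing p-value */
      end;
    end;
  endcomp;
run;
```

One caveat with this pattern: if a p-value is genuinely missing for one model, the loop falls through to the p-value of the model to its left, so it is worth checking rows where a variable does not appear in every model.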
I have already used whatever I could find online to improve the table.
Thanks. In my code, the number of asterisks does not change depending on the p-value. Can you tell me where the error is?
As you can see, the array lists the p-value columns starting from the rightmost one (_c11_), so the first non-missing value found is the most recent one filled in, and then I leave the loop. I guess it is not the most efficient way, but it works.
array cc _c11_ _c8_ _c5_ _c2_;
do i = 1 to 4;
  if ~missing(cc(i)) then do;
    if 0.05 < cc(i) <= 0.1 then call define(_col_, "style", "style=[ posttext='*']");
    else if 0.01 < cc(i) <= 0.05 then call define(_col_, "style", "style=[ posttext='**']");
    else if cc(i) <= 0.01 then call define(_col_, "style", "style=[ posttext='***']");
    leave;
  end;
end;
Can you post your final code, so I can use it as well? Thanks
It should be the following; hopefully this is the last example code.
/*write a dataset with standard error on the same column of coefficient*/
data parm2;
set temp.parm;
if not missing(stderr) and variable not in ('R-square','Adj.R-sq','N. Obs.') then do;
value=estimate; type='coefficient'; output; end;
if not missing(stderr) then do; value=stderr; type='stderr' ;output; end;
if variable in ('R-square','Adj.R-sq','N. Obs.') then do; value=estimate; type=variable ;output; end;
run;
proc format;
picture stderrf (round)
low-high=' 9.9999)' (prefix='(')
.=' ';
run;
ods html close;
ods html;
title;
proc report data=parm2 nowd out=temptbl;
column numord variable type dependent, (probt value);
define numord /group order=data noprint;
define variable / group order=data ' ';
define type / group order=data noprint;
define dependent / across ' ';
define value /analysis sum;
define probt /analysis sum;
compute value;
array cc _c8_ _c6_ _c4_ ;
if type='stderr' then do;
call define(_col_,'format','stderrf.');
end;
else do;
call define(_col_,'format','8.4');
do i= 1 to 3 ;
if ~missing(cc(i)) then do;
if 0.05<cc(i) <= 0.1 then call define(_col_, "style", "style=[ posttext='*']" );
else if 0.01 <cc(i) <=0.05 then call define(_col_, "style", "style=[ posttext='**']" );
else if cc(i) <= 0.01 then call define(_col_, "style", "style=[ posttext='***']" );
leave;end;
end;
end;
endcomp;
run;
Very interesting topic,
Thanks for sharing the code. I tried to merge everything, but it doesn't run properly. Is it possible to post actual running code with the dataset?
For example, after the first PROC IMPORT, there's a "set temp.parm;"
Thanks in advance!!
Philippe
This code is really helpful! Where do the variables specified in your array (_c11_ _c8_ _c5_ _c2_) come from? I'm not clear how I should adapt those variable names for my parameter estimates dataset.
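As far as I understand it, the _cN_ names are PROC REPORT's positional names for report columns, counted from left to right across the COLUMN statement, with NOPRINT columns included in the count. A hedged way to check them for your own layout is to write the report to an OUT= dataset and look at its variable names. The sketch below uses a hypothetical toy dataset `est` and a made-up output name `colcheck`, with the same column layout as the first post but two models instead of four.

```sas
/* Hypothetical toy data: one row per variable and model */
data est;
  input variable $ model $ pb1 estimate std1;
  datalines;
X1 M1 0.003 0.50 0.10
X1 M2 0.080 0.20 0.11
X2 M1 0.400 0.10 0.12
X2 M2 0.020 0.30 0.13
;
run;

/* Same column layout as the first post, with two models */
proc report data=est nowd out=colcheck;
  column variable model,(pb1 estimate std1);
  define variable / group order=data ' ';
  define model    / across ' ';
  define pb1      / analysis sum noprint;
  define estimate / analysis sum ' ';
  define std1     / analysis sum ' ';
run;

/* Columns under ACROSS are named by position in the OUT= data set:
   _C2_ _C3_ _C4_ for M1 and _C5_ _C6_ _C7_ for M2,
   so pb1 sits in _C2_ and _C5_ */
proc print data=colcheck;
run;
```

With three statistics per model, the p-value columns are 2, 5, 8, 11 for four models, which is where _c11_ _c8_ _c5_ _c2_ in the first post come from. If you add or remove a column to the left of the ACROSS variable, or change the number of statistics per model, the numbers shift accordingly.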