The SAS Output Delivery System and reporting techniques

format regression estimate tables for publishing

Reply
Occasional Contributor
Posts: 19

format regression estimate tables for publishing

I would like to add stars as superscripts of estimated parameters from regressions (e.g. ***, **, and * denotes significance level of 1%, 5% and 10%  respectively). This is for publishing finance research papers.

I would use a p-value to decide how many stars to add. For example, if p-value of X1 is 0.0023, I would add *** whereas if the p-value is 0.089, I would put * on the top right of the estimate. As an example:

year	intercept	X1	X2	X3	intercept_pvalue	X1_pvalue	X2_pvalue	X3_pvalue
2000	4.05***		1.2	5*	9.4**	0.00452			0.2345		0.05412		0.03767
2001	3.12*		1.0	4.4**	8.8	0.0912***		0.1235		0.01231		0.00012

This example only contains 4 parameters (intercept and 3 explanatory variables) but I would like to write a macro that automatically get the p-value for the corresponding parameters (see how there is a suffix "_pvalue" after the variable name) in case there are many variables and the names change all the time.
What I can think of is:

*need to use put function to convert numeric value to characters so that it contains *;
if variable_name+"_pvalue" <= 0.01 then variable= put(variable_name, 7.4) + "***";
if 0.01 <= variable_name+"_pvalue" <=0.05 then variable= put(variable_name, 7.4) + "**";
if 0.05 <= variable_name+"_pvalue" <0.1 then variable= put(variable_name, 7.4) + "*";

But I am not sure how to get a list of variable names. I can get it to a dataset using the following code:

* export the variable names and their position number into a data set called "data_info";
proc contents
     data = sashelp.class noprint out = data_info (keep = name varnum);
run;
* sort "data_info" by "varnum";
* export the sorted data set with the name "variable_names", and keep just the "name" column;
proc sort
     data = data_info out = variable_names(keep = name);
     by varnum;
run;

 

Trusted Advisor
Posts: 1,933

Re: format regression estimate tables for publishing

[ Edited ]
Posted in reply to ducman1611

To the issue of adding asterisks ... I believe you can use the PICTURE command in PROC FORMAT and then you don't need to convert numeric variables to character. In fact, I suspect someone has already done this, and a search might turn up the answer and then you don't have to code it yourself (although it isn't hard to do).

 

To get a list of variable names, you could use PROC SQL to create a macro variable with the list of variable names. Something like

 

proc sql noprint;
    select distinct name into :names separated by ' ' from variable_names;
quit;

Better still, you might want to obtain the list of variable names in the model from the output data set created by PROC REG or PROC GLM or whatever was used to compute the model.

 

Once you have the list of variable names, you assign the PICTURE format to them all, that's pretty simple in a data step or in PROC PRINT/PROC REPORT/PROC whatever.

 

Super User
Posts: 10,046

Re: format regression estimate tables for publishing

Posted in reply to ducman1611

You could make a format.

 

proc format;
picture fmt
low-0.01='9.999***'
0.01<-0.05='9.999**'
0.05<-0.1='9.999*';
run;
data x;
input p;
format p fmt10.;
cards;
0.15
0.02
0.06
0.009
;
run;


Super User
Posts: 19,870

Re: format regression estimate tables for publishing

Posted in reply to ducman1611

Could you create a custom format, and then say you use PROC REPORT to display your final results. In a computed column you can conditionally format the variables based on the P Values. This way you can maintain your data as numbers. Try a PICTURE format for adding the asterisks. In fact, if you search on LexJansen.com you'll probably find sample code for this as it's a common request. 

Ask a Question
Discussion stats
  • 3 replies
  • 347 views
  • 0 likes
  • 4 in conversation