Contributor
Posts: 61

# format regression estimate tables for publishing

I would like to add stars as superscripts of estimated parameters from regressions (e.g. ***, **, and * denotes significance level of 1%, 5% and 10%  respectively). This is for publishing finance research papers.

I would use a p-value to decide how many stars to add. For example, if p-value of X1 is 0.0023, I would add *** whereas if the p-value is 0.089, I would put * on the top right of the estimate. As an example:

```year	intercept	X1	X2	X3	intercept_pvalue	X1_pvalue	X2_pvalue	X3_pvalue
2000	4.05***		1.2	5*	9.4**	0.00452			0.2345		0.05412		0.03767
2001	3.12*		1.0	4.4**	8.8	0.0912***		0.1235		0.01231		0.00012```

This example only contains 4 parameters (intercept and 3 explanatory variables) but I would like to write a macro that automatically get the p-value for the corresponding parameters (see how there is a suffix "_pvalue" after the variable name) in case there are many variables and the names change all the time.
What I can think of is:

```*need to use put function to convert numeric value to characters so that it contains *;
if variable_name+"_pvalue" <= 0.01 then variable= put(variable_name, 7.4) + "***";
if 0.01 <= variable_name+"_pvalue" <=0.05 then variable= put(variable_name, 7.4) + "**";
if 0.05 <= variable_name+"_pvalue" <0.1 then variable= put(variable_name, 7.4) + "*";```

But I am not sure how to get a list of variable names. I can get it to a dataset using the following code:

``````* export the variable names and their position number into a data set called "data_info";
proc contents
data = sashelp.class noprint out = data_info (keep = name varnum);
run;
* sort "data_info" by "varnum";
* export the sorted data set with the name "variable_names", and keep just the "name" column;
proc sort
data = data_info out = variable_names(keep = name);
by varnum;
run;``````

Posts: 3,066

## Re: format regression estimate tables for publishing

[ Edited ]

To the issue of adding asterisks ... I believe you can use the PICTURE command in PROC FORMAT and then you don't need to convert numeric variables to character. In fact, I suspect someone has already done this, and a search might turn up the answer and then you don't have to code it yourself (although it isn't hard to do).

To get a list of variable names, you could use PROC SQL to create a macro variable with the list of variable names. Something like

```proc sql noprint;
select distinct name into :names separated by ' ' from variable_names;
quit;
```

Better still, you might want to obtain the list of variable names in the model from the output data set created by PROC REG or PROC GLM or whatever was used to compute the model.

Once you have the list of variable names, you assign the PICTURE format to them all, that's pretty simple in a data step or in PROC PRINT/PROC REPORT/PROC whatever.

--
Paige Miller
Super User
Posts: 10,787

## Re: format regression estimate tables for publishing

You could make a format.

``````proc format;
picture fmt
low-0.01='9.999***'
0.01<-0.05='9.999**'
0.05<-0.1='9.999*';
run;
data x;
input p;
format p fmt10.;
cards;
0.15
0.02
0.06
0.009
;
run;

``````
Super User
Posts: 23,778