a month ago
I would like to add stars as superscripts of estimated parameters from regressions (e.g. ***, **, and * denotes significance level of 1%, 5% and 10% respectively). This is for publishing finance research papers.
I would use a p-value to decide how many stars to add. For example, if p-value of X1 is 0.0023, I would add *** whereas if the p-value is 0.089, I would put * on the top right of the estimate. As an example:
year intercept X1 X2 X3 intercept_pvalue X1_pvalue X2_pvalue X3_pvalue 2000 4.05*** 1.2 5* 9.4** 0.00452 0.2345 0.05412 0.03767 2001 3.12* 1.0 4.4** 8.8 0.0912*** 0.1235 0.01231 0.00012
This example only contains 4 parameters (intercept and 3 explanatory variables) but I would like to write a macro that automatically get the p-value for the corresponding parameters (see how there is a suffix "_pvalue" after the variable name) in case there are many variables and the names change all the time.
What I can think of is:
*need to use put function to convert numeric value to characters so that it contains *; if variable_name+"_pvalue" <= 0.01 then variable= put(variable_name, 7.4) + "***"; if 0.01 <= variable_name+"_pvalue" <=0.05 then variable= put(variable_name, 7.4) + "**"; if 0.05 <= variable_name+"_pvalue" <0.1 then variable= put(variable_name, 7.4) + "*";
But I am not sure how to get a list of variable names. I can get it to a dataset using the following code:
* export the variable names and their position number into a data set called "data_info"; proc contents data = sashelp.class noprint out = data_info (keep = name varnum); run; * sort "data_info" by "varnum"; * export the sorted data set with the name "variable_names", and keep just the "name" column; proc sort data = data_info out = variable_names(keep = name); by varnum; run;
a month ago - last edited a month ago
To the issue of adding asterisks ... I believe you can use the PICTURE command in PROC FORMAT and then you don't need to convert numeric variables to character. In fact, I suspect someone has already done this, and a search might turn up the answer and then you don't have to code it yourself (although it isn't hard to do).
To get a list of variable names, you could use PROC SQL to create a macro variable with the list of variable names. Something like
proc sql noprint; select distinct name into :names separated by ' ' from variable_names; quit;
Better still, you might want to obtain the list of variable names in the model from the output data set created by PROC REG or PROC GLM or whatever was used to compute the model.
Once you have the list of variable names, you assign the PICTURE format to them all, that's pretty simple in a data step or in PROC PRINT/PROC REPORT/PROC whatever.
a month ago
You could make a format.
proc format; picture fmt low-0.01='9.999***' 0.01<-0.05='9.999**' 0.05<-0.1='9.999*'; run; data x; input p; format p fmt10.; cards; 0.15 0.02 0.06 0.009 ; run;
a month ago
Could you create a custom format, and then say you use PROC REPORT to display your final results. In a computed column you can conditionally format the variables based on the P Values. This way you can maintain your data as numbers. Try a PICTURE format for adding the asterisks. In fact, if you search on LexJansen.com you'll probably find sample code for this as it's a common request.