Solved: Re: Creating a table of means with asterisk

Astana · Posted 10-04-2019 01:09 PM

Hello, SAS community!

I have a table with some daily data (Sample_data). 'Date' variable specifies the date. 'Group' denotes the assigned group for each value.

Date	Group 1	Group 2	Group 3	Group 4
10/4/2019	12.00	12.00	12.00	12.00
10/5/2019	12.00	34.00	34.00	34.00
10/6/2019	12.00	43.00	43.00	43.00
10/7/2019	12.00	60.67	60.67	60.67
10/8/2019	32.00	76.17	76.17	76.17
10/9/2019	43.00	91.67	91.67	91.67
10/10/2019	40.49	107.17	107.17	107.17
10/11/2019	46.35	122.67	122.67	122.67
10/12/2019	52.22	138.17	138.17	138.17

I would like to create another table with means and t-statistics in parentheses for each group (1 row, 4 columns) and add to the mean values in the new table stars as superscripts which will identify the significance level(e.g. ***, **, and * denotes significance level of 1%, 5% and 10% respectively).

I kindly ask SAS experts to help me with a code.

Best regards,

Reeza · Posted 10-04-2019 05:29 PM

1. Use PROC MEANS to get the results needed in cells - T, PROBT and MEAN are the stats you want and PROC MEANS shows them

2. Use CATT() to combine into displayed values you want

3. Transpose to a wide format

Untested unfortunately, but should get you started. Note that the proc means creates two output data sets, summary1 and summary2 in different layouts.

I felt the second is more dynamic.

proc means data=have N MEAN T PROBT STACKODS;
var group1-group4;
output out=summary1 n= mean= t= probt= / autoname;
ods output summary=summary2;
run;

data want_long;
set summary2;
length asterisk $3.;
if probt<0.01 then asterisk ='***';
else if probt<0.05 then asterisk = "**";
else if probt<0.1 then asterisk='*';

display = catt(mean, "(", t, ")", asterisk);

run;

*not sure about this step, flip to wide format;
proc transpose data=want_long out=want_wide ;
id variable;
var display;
run;

@Astana wrote:

Hello, SAS community!

I have a table with some daily data (Sample_data). 'Date' variable specifies the date. 'Group' denotes the assigned group for each value.

Date Group 1 Group 2 Group 3 Group 4

10/4/2019 12.00 12.00 12.00 12.00

10/5/2019 12.00 34.00 34.00 34.00

10/6/2019 12.00 43.00 43.00 43.00

10/7/2019 12.00 60.67 60.67 60.67

10/8/2019 32.00 76.17 76.17 76.17

10/9/2019 43.00 91.67 91.67 91.67

10/10/2019 40.49 107.17 107.17 107.17

10/11/2019 46.35 122.67 122.67 122.67

10/12/2019 52.22 138.17 138.17 138.17

I would like to create another table with means and t-statistics in parentheses for each group (1 row, 4 columns) and add to the mean values in the new table stars as superscripts which will identify the significance level(e.g. ***, **, and * denotes significance level of 1%, 5% and 10% respectively).

I kindly ask SAS experts to help me with a code.

Best regards,

View solution in original post

Reeza · Posted 10-04-2019 01:11 PM

What tests determine the significance levels? Can you show the expected output based on this data?

Astana · Posted 10-04-2019 02:04 PM

Thank you, Reeza.

The significance level is determined by whether p-value is smaller then the specified significance level.

P-value is calculated from defining t-stat (mean/st.error) and degrees of freedom (# of observations - 1)

Here is the desired output (significance levels don't reflect the actual one)

	Group 1	Group 2	Group 3	Group 4
Mean	29.12(7.05)*	76.17(11.69)**	76.17(11.69)**	76.17(11.69)**

Reeza · Posted 10-04-2019 03:31 PM

What is the t-test? Against the other groups? Against a known value? If you're testing against the other groups are you doing that one by one? If so, are you correcting for multiple testing?

Astana · Posted 10-04-2019 04:44 PM

T-stat for the group we calculate using the mean of the same group and divide by st.error of the same group.

Degrees of freedom are calculated by subtracting 1 from the number of observations in the group.

Using t-stat and degrees of freedom as inputs we can calculate the p-value.

Excel formula for this is T.DIST.RT(tstat, df)

Comparing p value with significance level (0.01, 0.05, 0.10) we can conclude whether the mean value is different from zero.

Example: if we find p-value equal to 0.08 then mean value is different from zero with 0.10 significance level. So we assign one asterisk.

so the result will be: mean_value(t_stat)*

Reeza · Posted 10-04-2019 04:47 PM

So you're testing if the mean values are different than 0?

Astana · Posted 10-04-2019 04:49 PM

Yes. This is what I am trying to get.

Reeza · Posted 10-04-2019 05:29 PM

1. Use PROC MEANS to get the results needed in cells - T, PROBT and MEAN are the stats you want and PROC MEANS shows them

2. Use CATT() to combine into displayed values you want

3. Transpose to a wide format

Untested unfortunately, but should get you started. Note that the proc means creates two output data sets, summary1 and summary2 in different layouts.

I felt the second is more dynamic.

proc means data=have N MEAN T PROBT STACKODS;
var group1-group4;
output out=summary1 n= mean= t= probt= / autoname;
ods output summary=summary2;
run;

data want_long;
set summary2;
length asterisk $3.;
if probt<0.01 then asterisk ='***';
else if probt<0.05 then asterisk = "**";
else if probt<0.1 then asterisk='*';

display = catt(mean, "(", t, ")", asterisk);

run;

*not sure about this step, flip to wide format;
proc transpose data=want_long out=want_wide ;
id variable;
var display;
run;

@Astana wrote:

Hello, SAS community!

I have a table with some daily data (Sample_data). 'Date' variable specifies the date. 'Group' denotes the assigned group for each value.

Date Group 1 Group 2 Group 3 Group 4

10/4/2019 12.00 12.00 12.00 12.00

10/5/2019 12.00 34.00 34.00 34.00

10/6/2019 12.00 43.00 43.00 43.00

10/7/2019 12.00 60.67 60.67 60.67

10/8/2019 32.00 76.17 76.17 76.17

10/9/2019 43.00 91.67 91.67 91.67

10/10/2019 40.49 107.17 107.17 107.17

10/11/2019 46.35 122.67 122.67 122.67

10/12/2019 52.22 138.17 138.17 138.17

I would like to create another table with means and t-statistics in parentheses for each group (1 row, 4 columns) and add to the mean values in the new table stars as superscripts which will identify the significance level(e.g. ***, **, and * denotes significance level of 1%, 5% and 10% respectively).

I kindly ask SAS experts to help me with a code.

Best regards,

Astana · Posted 10-09-2019 07:50 PM

Thank you! That works perfectly!

Is there any mean I can make the final output values rounded to 4-5 values after the decimal point?

Reeza · Posted 10-09-2019 09:49 PM

Replace mean with put(mean, 12.4) in the CATT function.

Astana · Posted 10-10-2019 10:30 AM

Thank you!

SAS Innovate 2025: Save the Date

SAS Training: Just a Click Away