BookmarkSubscribeRSS Feed
Zaichik17
Calcite | Level 5

I am creating a table for 4 variables : Dem_Incomebelowpoverty, Dem_pctn_AfricanAmerican, Out_County_obesity, and Risk_pct_yes. Here is part of the data, there are a total of 64 observations.

Zaichik17_0-1639425353001.png

I want to have these 4 variables and then the Mean +/- SD as one column and Median as the second column,

so after Proc Tabulate it looks like this:

 

ods output Table=SummaryTable1_pre;

PROC TABULATE DATA = CoAnl.combined_analysis MISSING;
VAR Dem_IncomeBelowPoverty Dem_Pctn_Africanamerican Out_County_obesity Risk_pct_yes ;
TABLE Dem_IncomeBelowPoverty Dem_Pctn_Africanamerican Out_County_obesity Risk_pct_yes, mean*f=percent7.1 std*f=percent7.1 median*f=percent7.1/nocellmerge;
RUN;

This is the result:

Zaichik17_0-1639428421752.png

Next I need to combine Mean and Std to show as Mean +/- SD, so I'm using a data step and need to use proc transpose as this creates wide data, however I can't figure out how to keep Median in there? I want the final table to look like above but with Mean and Std combined.

DATA SummaryTable_pre2;
SET SummaryTable1_pre;
Dem_IncomeBelowPoverty=STRIP(PUT(Dem_IncomeBelowPoverty_Mean,8.2))||"^{Unicode 00B1}"||STRIP(PUT(Dem_IncomeBelowPoverty_Std,8.2));
Dem_Pctn_Africanamerican=STRIP(PUT(Dem_Pctn_Africanamerican_Mean,8.2))||"^{Unicode 00B1}"||STRIP(PUT(Dem_Pctn_Africanamerican_Std,8.2));
Out_County_obesity=STRIP(PUT(Out_County_obesity_Mean,8.2))||"^{Unicode 00B1}"||STRIP(PUT(Out_County_obesity_Std,8.2));
Risk_pct_yes=STRIP(PUT(Risk_pct_yes_Mean,8.2))||"^{Unicode 00B1}"||STRIP(PUT(Risk_pct_yes_Std,8.2));
RUN;

 

After I combined Mean and std to have a plus/minus sign, I used proc transpose:

PROC TRANSPOSE
DATA=SummaryTable1_pre2
OUT=SummaryTable_clean (RENAME=(_NAME_= Variable Col1=MeanSD));
VAR Dem_IncomeBelowPoverty Dem_Pctn_Africanamerican Out_County_obesity Risk_pct_yes Dem_Pctn_Africanamerican_Median Dem_IncomeBelowPoverty_Median Out_County_obesity_Median Risk_pct_yes_Median ;
RUN;

 

I'm not sure what to do to the Dem_Pctn_Africanamerican_Median Dem_IncomeBelowPoverty_Median Out_County_obesity_Median Risk_pct_yes_Median variables to make a variable named Median and make it transposed onto the 4 variables I want to keep.

Zaichik17_0-1639429206431.png

 

 

Thanks!

2 REPLIES 2
Reeza
Super User
Instead of PROC TABULATE to calculate the statistics, I would suggest using PROC MEANS as you can control the output better and avoid the transpose step.
ballardw
Super User

First thing about dealing with percentages: If your samples for each percentage are not the exact same then "mean" of a percentage is a misnomer at best and an outright lie at worst. Consider if a percentage of 15% is based on a population of 10,000 (1500/10000) and a second percentage of 13% is based on a population of 100 (13/100) when you average 15 and 13 you get 14. But the true percentage of the two groups combined is (1500 +13)/(10000+100) = 14.9802 %(rounded to 4 decimal places.

 

You may be better off with the data before you calculated means to get some of those summaries.