Using Proc Surveyfreq to create column totals/percentage within subgro...

Wolverine

I'm working on a task that is a continuation of my previous post. But now I need to change the output table so that the percentages within each column (for each analysis variable) sum to 100%.

Below is sample data, and the code that I have so far. PUFFWGT is the person weight, while PUFF001 is a replicate weight to account for effects of the sampling/survey design. There are actually 100 replicate weights, but I only included 2 as an example. TM_group and ADRD_group are my subgroups, DEM_AGE is one of the analysis variables, and cohort_flag indicates cases that are included in the analysis. I'm only interested in cases where cohort_flag=1, but I can't delete cases where cohort_flag=0 because this is weighted sample data.

data have;
infile datalines dsd dlm=',' truncover;
input TM_group ADRD_group DEM_AGE cohort_flag PUFFWGT PUFF001-PUFF002;
datalines;
0,0,2,1,2110.5357,676.0632354,3108.917787
0,1,3,1,918.4676314,1106.627216,11398.01208
0,0,3,1,2758.096955,4735.215718,1418.690422
0,1,2,1,2456.208457,918.6245816,4548.556261
0,1,2,1,616.4292264,1033.661611,1213.613946
0,0,2,1,3673.275956,6316.504889,1704.047563
0,1,3,1,1102.850296,304.9154776,904.2769054
0,0,3,1,2714.610724,877.3159168,993.7824266
0,1,2,1,20090.52479,6085.947656,7590.092351
1,1,3,1,9098.780523,3498.802425,1335.317433
1,1,2,0,7196.92957,11977.46854,1240.066816
1,1,3,1,842.3146743,1255.501335,4314.886534
1,0,2,0,13516.51008,4775.164303,1106.604846
1,0,2,0,4460.828162,1083.99043,943.9937287
1,0,3,0,4818.845124,9286.932256,8197.376118
1,0,2,1,2848.023551,4739.321046,1113.329766
1,.,3,0,3050.820193,805.9607077,761.0495075
1,.,2,1,4256.565135,1176.233283,1775.665732
1,1,2,1,5211.416525,1570.093954,291.3352611
1,1,3,0,4181.422692,8242.459467,2573.543707
1,1,3,1,3179.943422,4688.70721,6518.281019
.,.,2,1,3451.708714,4256.625072,1167.477312
.,.,3,1,3018.745468,4729.150346,1565.75873
.,.,3,1,4098.775,1486.742512,1283.25
.,.,3,1,3483.708151,1098.980881,1349.598
;RUN;
 
/*Section 0.5 -- Sort analysis file on BY variables, as required by PROC SURVEYFREQ*/
PROC SORT data=have;
      BY cohort_flag; RUN;
 
 
/***  SECTION 1 -- Examine weighted freqs for each group      ***/
 
/*Section 1.1.2 -- Generate weighted freqs*/
title1 "Section 1.1.2 -- Weighted freqs - both person and sample weights V&version.";
PROC SURVEYFREQ data=have missing;
      WEIGHT PUFFWGT;
      REPWEIGHTS PUFF001-PUFF002;
      BY cohort_flag;
      TABLES TM_group*ADRD_group*DEM_AGE / nosparse nowt ;
RUN;

Here is one of the output tables, where cohort_flag=1. Note that when ADRD_group=1, DEM_AGE has freqs where '2'=3 and '3'=2, for a total of 5. That's correct, but the percents are roughly 63% and 5%, when they should be 60% and 40% so that they sum to 100%. The total row should read "Total 5 100"

Table of ADRD_group by DEM_AGE
Controlling for TM_group=0
ADRD_group	DEM_AGE	Frequency	Percent	Std Err of
ADRD_group	DEM_AGE	Frequency	Percent	Percent
0	2	2	15.8717	11.23
	3	2	15.018	9.1592
	Total	4	30.8897	19.6118
1	2	3	63.5635	25.122
	3	2	5.5468	22.5428
	Total	5	69.1103	19.6118
Total	2	5	79.4352	18.873
	3	4	20.5648	18.873
	Total	9	100

ballardw

If you want a Row percent total in the table then add the ROW option to the Tables statement.

If you want a Column percent total in the table then ad the COLUMN option to the Tables statement.

Otherwise the result is the percent of the TABLE only.

Using Proc Surveyfreq to create column totals/percentage within subgroups

Re: Using Proc Surveyfreq to create column totals/percentage within subgroups

Using Proc Surveyfreq to create column totals/percentage within subgroups

Re: Using Proc Surveyfreq to create column totals/percentage within subgroups

Click image to register for webinar

Classroom Training Available!