I'm working on a task that is a continuation of my previous post. But now I need to change the output table so that the percentages within each column (for each analysis variable) sum to 100%.
Below is sample data, and the code that I have so far. PUFFWGT is the person weight, while PUFF001 is a replicate weight to account for effects of the sampling/survey design. There are actually 100 replicate weights, but I only included 2 as an example. TM_group and ADRD_group are my subgroups, DEM_AGE is one of the analysis variables, and cohort_flag indicates cases that are included in the analysis. I'm only interested in cases where cohort_flag=1, but I can't delete cases where cohort_flag=0 because this is weighted sample data.
data have;
infile datalines dsd dlm=',' truncover;
input TM_group ADRD_group DEM_AGE cohort_flag PUFFWGT PUFF001-PUFF002;
datalines;
0,0,2,1,2110.5357,676.0632354,3108.917787
0,1,3,1,918.4676314,1106.627216,11398.01208
0,0,3,1,2758.096955,4735.215718,1418.690422
0,1,2,1,2456.208457,918.6245816,4548.556261
0,1,2,1,616.4292264,1033.661611,1213.613946
0,0,2,1,3673.275956,6316.504889,1704.047563
0,1,3,1,1102.850296,304.9154776,904.2769054
0,0,3,1,2714.610724,877.3159168,993.7824266
0,1,2,1,20090.52479,6085.947656,7590.092351
1,1,3,1,9098.780523,3498.802425,1335.317433
1,1,2,0,7196.92957,11977.46854,1240.066816
1,1,3,1,842.3146743,1255.501335,4314.886534
1,0,2,0,13516.51008,4775.164303,1106.604846
1,0,2,0,4460.828162,1083.99043,943.9937287
1,0,3,0,4818.845124,9286.932256,8197.376118
1,0,2,1,2848.023551,4739.321046,1113.329766
1,.,3,0,3050.820193,805.9607077,761.0495075
1,.,2,1,4256.565135,1176.233283,1775.665732
1,1,2,1,5211.416525,1570.093954,291.3352611
1,1,3,0,4181.422692,8242.459467,2573.543707
1,1,3,1,3179.943422,4688.70721,6518.281019
.,.,2,1,3451.708714,4256.625072,1167.477312
.,.,3,1,3018.745468,4729.150346,1565.75873
.,.,3,1,4098.775,1486.742512,1283.25
.,.,3,1,3483.708151,1098.980881,1349.598
;RUN;
/*Section 0.5 -- Sort analysis file on BY variables, as required by PROC SURVEYFREQ*/
PROC SORT data=have;
BY cohort_flag; RUN;
/*** SECTION 1 -- Examine weighted freqs for each group ***/
/*Section 1.1.2 -- Generate weighted freqs*/
title1 "Section 1.1.2 -- Weighted freqs - both person and sample weights V&version.";
PROC SURVEYFREQ data=have missing;
WEIGHT PUFFWGT;
REPWEIGHTS PUFF001-PUFF002;
BY cohort_flag;
TABLES TM_group*ADRD_group*DEM_AGE / nosparse nowt ;
RUN;
Here is one of the output tables, where cohort_flag=1. Note that when ADRD_group=1, DEM_AGE has freqs where '2'=3 and '3'=2, for a total of 5. That's correct, but the percents are roughly 63% and 5%, when they should be 60% and 40% so that they sum to 100%. The total row should read "Total 5 100"
Table of ADRD_group by DEM_AGE | ||||
Controlling for TM_group=0 | ||||
ADRD_group | DEM_AGE | Frequency | Percent | Std Err of |
Percent | ||||
0 | 2 | 2 | 15.8717 | 11.23 |
3 | 2 | 15.018 | 9.1592 | |
Total | 4 | 30.8897 | 19.6118 | |
1 | 2 | 3 | 63.5635 | 25.122 |
3 | 2 | 5.5468 | 22.5428 | |
Total | 5 | 69.1103 | 19.6118 | |
Total | 2 | 5 | 79.4352 | 18.873 |
3 | 4 | 20.5648 | 18.873 | |
Total | 9 | 100 |
If you want a Row percent total in the table then add the ROW option to the Tables statement.
If you want a Column percent total in the table then ad the COLUMN option to the Tables statement.
Otherwise the result is the percent of the TABLE only.
If you want a Row percent total in the table then add the ROW option to the Tables statement.
If you want a Column percent total in the table then ad the COLUMN option to the Tables statement.
Otherwise the result is the percent of the TABLE only.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9.
Lock in the best rate now before the price increases on April 1.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.