I'm working on a task that is a continuation of my previous post. But now I need to change the output table so that the percentages within each column (for each analysis variable) sum to 100%.
Below is sample data, and the code that I have so far. PUFFWGT is the person weight, while PUFF001 is a replicate weight to account for effects of the sampling/survey design. There are actually 100 replicate weights, but I only included 2 as an example. TM_group and ADRD_group are my subgroups, DEM_AGE is one of the analysis variables, and cohort_flag indicates cases that are included in the analysis. I'm only interested in cases where cohort_flag=1, but I can't delete cases where cohort_flag=0 because this is weighted sample data.
data have;
infile datalines dsd dlm=',' truncover;
input TM_group ADRD_group DEM_AGE cohort_flag PUFFWGT PUFF001-PUFF002;
datalines;
0,0,2,1,2110.5357,676.0632354,3108.917787
0,1,3,1,918.4676314,1106.627216,11398.01208
0,0,3,1,2758.096955,4735.215718,1418.690422
0,1,2,1,2456.208457,918.6245816,4548.556261
0,1,2,1,616.4292264,1033.661611,1213.613946
0,0,2,1,3673.275956,6316.504889,1704.047563
0,1,3,1,1102.850296,304.9154776,904.2769054
0,0,3,1,2714.610724,877.3159168,993.7824266
0,1,2,1,20090.52479,6085.947656,7590.092351
1,1,3,1,9098.780523,3498.802425,1335.317433
1,1,2,0,7196.92957,11977.46854,1240.066816
1,1,3,1,842.3146743,1255.501335,4314.886534
1,0,2,0,13516.51008,4775.164303,1106.604846
1,0,2,0,4460.828162,1083.99043,943.9937287
1,0,3,0,4818.845124,9286.932256,8197.376118
1,0,2,1,2848.023551,4739.321046,1113.329766
1,.,3,0,3050.820193,805.9607077,761.0495075
1,.,2,1,4256.565135,1176.233283,1775.665732
1,1,2,1,5211.416525,1570.093954,291.3352611
1,1,3,0,4181.422692,8242.459467,2573.543707
1,1,3,1,3179.943422,4688.70721,6518.281019
.,.,2,1,3451.708714,4256.625072,1167.477312
.,.,3,1,3018.745468,4729.150346,1565.75873
.,.,3,1,4098.775,1486.742512,1283.25
.,.,3,1,3483.708151,1098.980881,1349.598
;RUN;
/*Section 0.5 -- Sort analysis file on BY variables, as required by PROC SURVEYFREQ*/
PROC SORT data=have;
BY cohort_flag; RUN;
/*** SECTION 1 -- Examine weighted freqs for each group ***/
/*Section 1.1.2 -- Generate weighted freqs*/
title1 "Section 1.1.2 -- Weighted freqs - both person and sample weights V&version.";
PROC SURVEYFREQ data=have missing;
WEIGHT PUFFWGT;
REPWEIGHTS PUFF001-PUFF002;
BY cohort_flag;
TABLES TM_group*ADRD_group*DEM_AGE / nosparse nowt ;
RUN;
Here is one of the output tables, where cohort_flag=1. Note that when ADRD_group=1, DEM_AGE has freqs where '2'=3 and '3'=2, for a total of 5. That's correct, but the percents are roughly 63% and 5%, when they should be 60% and 40% so that they sum to 100%. The total row should read "Total 5 100"
| Table of ADRD_group by DEM_AGE | ||||
| Controlling for TM_group=0 | ||||
| ADRD_group | DEM_AGE | Frequency | Percent | Std Err of |
| Percent | ||||
| 0 | 2 | 2 | 15.8717 | 11.23 |
| 3 | 2 | 15.018 | 9.1592 | |
| Total | 4 | 30.8897 | 19.6118 | |
| 1 | 2 | 3 | 63.5635 | 25.122 |
| 3 | 2 | 5.5468 | 22.5428 | |
| Total | 5 | 69.1103 | 19.6118 | |
| Total | 2 | 5 | 79.4352 | 18.873 |
| 3 | 4 | 20.5648 | 18.873 | |
| Total | 9 | 100 | ||
If you want a Row percent total in the table then add the ROW option to the Tables statement.
If you want a Column percent total in the table then ad the COLUMN option to the Tables statement.
Otherwise the result is the percent of the TABLE only.
If you want a Row percent total in the table then add the ROW option to the Tables statement.
If you want a Column percent total in the table then ad the COLUMN option to the Tables statement.
Otherwise the result is the percent of the TABLE only.
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.