I'm working on a task that is a continuation of my previous post. But now I need to change the output table so that the percentages within each column (for each analysis variable) sum to 100%.
Below is sample data, and the code that I have so far. PUFFWGT is the person weight, while PUFF001 is a replicate weight to account for effects of the sampling/survey design. There are actually 100 replicate weights, but I only included 2 as an example. TM_group and ADRD_group are my subgroups, DEM_AGE is one of the analysis variables, and cohort_flag indicates cases that are included in the analysis. I'm only interested in cases where cohort_flag=1, but I can't delete cases where cohort_flag=0 because this is weighted sample data.
data have;
infile datalines dsd dlm=',' truncover;
input TM_group ADRD_group DEM_AGE cohort_flag PUFFWGT PUFF001-PUFF002;
datalines;
0,0,2,1,2110.5357,676.0632354,3108.917787
0,1,3,1,918.4676314,1106.627216,11398.01208
0,0,3,1,2758.096955,4735.215718,1418.690422
0,1,2,1,2456.208457,918.6245816,4548.556261
0,1,2,1,616.4292264,1033.661611,1213.613946
0,0,2,1,3673.275956,6316.504889,1704.047563
0,1,3,1,1102.850296,304.9154776,904.2769054
0,0,3,1,2714.610724,877.3159168,993.7824266
0,1,2,1,20090.52479,6085.947656,7590.092351
1,1,3,1,9098.780523,3498.802425,1335.317433
1,1,2,0,7196.92957,11977.46854,1240.066816
1,1,3,1,842.3146743,1255.501335,4314.886534
1,0,2,0,13516.51008,4775.164303,1106.604846
1,0,2,0,4460.828162,1083.99043,943.9937287
1,0,3,0,4818.845124,9286.932256,8197.376118
1,0,2,1,2848.023551,4739.321046,1113.329766
1,.,3,0,3050.820193,805.9607077,761.0495075
1,.,2,1,4256.565135,1176.233283,1775.665732
1,1,2,1,5211.416525,1570.093954,291.3352611
1,1,3,0,4181.422692,8242.459467,2573.543707
1,1,3,1,3179.943422,4688.70721,6518.281019
.,.,2,1,3451.708714,4256.625072,1167.477312
.,.,3,1,3018.745468,4729.150346,1565.75873
.,.,3,1,4098.775,1486.742512,1283.25
.,.,3,1,3483.708151,1098.980881,1349.598
;RUN;
/*Section 0.5 -- Sort analysis file on BY variables, as required by PROC SURVEYFREQ*/
PROC SORT data=have;
BY cohort_flag; RUN;
/*** SECTION 1 -- Examine weighted freqs for each group ***/
/*Section 1.1.2 -- Generate weighted freqs*/
title1 "Section 1.1.2 -- Weighted freqs - both person and sample weights V&version.";
PROC SURVEYFREQ data=have missing;
WEIGHT PUFFWGT;
REPWEIGHTS PUFF001-PUFF002;
BY cohort_flag;
TABLES TM_group*ADRD_group*DEM_AGE / nosparse nowt ;
RUN;
Here is one of the output tables, where cohort_flag=1. Note that when ADRD_group=1, DEM_AGE has freqs where '2'=3 and '3'=2, for a total of 5. That's correct, but the percents are roughly 63% and 5%, when they should be 60% and 40% so that they sum to 100%. The total row should read "Total 5 100"
Table of ADRD_group by DEM_AGE | ||||
Controlling for TM_group=0 | ||||
ADRD_group | DEM_AGE | Frequency | Percent | Std Err of |
Percent | ||||
0 | 2 | 2 | 15.8717 | 11.23 |
3 | 2 | 15.018 | 9.1592 | |
Total | 4 | 30.8897 | 19.6118 | |
1 | 2 | 3 | 63.5635 | 25.122 |
3 | 2 | 5.5468 | 22.5428 | |
Total | 5 | 69.1103 | 19.6118 | |
Total | 2 | 5 | 79.4352 | 18.873 |
3 | 4 | 20.5648 | 18.873 | |
Total | 9 | 100 |
If you want a Row percent total in the table then add the ROW option to the Tables statement.
If you want a Column percent total in the table then ad the COLUMN option to the Tables statement.
Otherwise the result is the percent of the TABLE only.
If you want a Row percent total in the table then add the ROW option to the Tables statement.
If you want a Column percent total in the table then ad the COLUMN option to the Tables statement.
Otherwise the result is the percent of the TABLE only.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.