I am trying to perform what I hope is a simple calculation within proc tabulate but I am stumped.
To give you some background I am trying replicate some work in Excel to automatically produce a report using ods rtf/ods when I take the the number of people registered on a course and the number of people who successfully completed the course to calculate the percentage of those people by age groups.
In Excel I used summarised to create the following pivot table:
Pivot Table 1
acad_yr | |||||||
age_HESA | Data | 2009/10 | 2010/11 | 2011/12 | 2012/13 | 2013/14 | Grand Total |
20 & below | Sum of total_registrations | 2750.9 | 3189.8 | 3606.5 | 3514.5 | 3197.0 | 16258.7 |
Sum of total_completions | 1857.8 | 2214.4 | 2543.4 | 2337.0 | 2074.0 | 11026.6 | |
21 to 25 | Sum of total_registrations | 9956.5 | 10542.6 | 11193.1 | 11608.2 | 11746.9 | 55047.4 |
Sum of total_completions | 6461.2 | 6986.4 | 7479.5 | 7513.9 | 7545.7 | 35986.8 | |
26 to 30 | Sum of total_registrations | 12608.1 | 12949.0 | 13015.1 | 12685.3 | 12288.7 | 63546.2 |
Sum of total_completions | 8292.0 | 8755.5 | 8844.3 | 8534.5 | 8190.3 | 42616.5 | |
31 to 35 | Sum of total_registrations | 10561.0 | 11140.7 | 11574.5 | 10958.3 | 10595.3 | 54829.8 |
Sum of total_completions | 7219.2 | 7724.4 | 8109.1 | 7528.0 | 7196.3 | 37777.1 | |
36 to 40 | Sum of total_registrations | 10699.7 | 10396.0 | 9913.5 | 8741.0 | 7921.1 | 47671.3 |
Sum of total_completions | 7516.4 | 7441.0 | 7112.1 | 6162.6 | 5489.9 | 33722.0 | |
41 to 45 | Sum of total_registrations | 9323.0 | 9299.8 | 9170.3 | 8143.8 | 7223.0 | 43160.0 |
Sum of total_completions | 6626.7 | 6775.3 | 6695.5 | 5846.5 | 5127.5 | 31071.5 | |
46 to 50 | Sum of total_registrations | 6500.3 | 6592.8 | 6664.4 | 6035.4 | 5535.6 | 31328.5 |
Sum of total_completions | 4658.5 | 4793.7 | 4876.3 | 4398.6 | 3964.1 | 22691.2 | |
51 to 55 | Sum of total_registrations | 3887.7 | 3925.2 | 3983.7 | 3579.8 | 3256.2 | 18632.6 |
Sum of total_completions | 2777.0 | 2892.2 | 2888.5 | 2615.5 | 2338.0 | 13511.0 | |
56 to 60 | Sum of total_registrations | 2385.7 | 2389.5 | 2435.4 | 2223.7 | 1999.2 | 11433.5 |
Sum of total_completions | 1708.0 | 1744.6 | 1808.1 | 1636.2 | 1480.0 | 8376.9 | |
61 to 65 | Sum of total_registrations | 1778.3 | 1800.2 | 1783.8 | 1555.8 | 1365.5 | 8283.7 |
Sum of total_completions | 1339.5 | 1355.0 | 1365.2 | 1191.0 | 1033.5 | 6284.2 | |
66 & over | Sum of total_registrations | 1403.4 | 1454.3 | 1412.2 | 1373.9 | 1341.2 | 6985.0 |
Sum of total_completions | 991.8 | 1065.8 | 1045.0 | 1042.3 | 992.2 | 5137.1 | |
Total Sum of total_registrations | 71854.7 | 73680.0 | 74752.6 | 70419.7 | 66469.7 | 357176.7 | |
Total Sum of total_completions | 49448.0 | 51748.3 | 52767.0 | 48806.0 | 45431.5 | 248200.8 |
Using this pivot table I then create the below table in Excel which is what gets published to a wider audience
[To calculate the percentages for the specific year and age group I do (Sum of total_completions/Sum of total_registrations)*100]
Final Table 1
2009/10 | 2010/11 | 2011/12 | 2012/13 | 2013/14 | ||||||
No of registrations | % | No of registrations | % | No of registrations | % | No of registrations | % | No of registrations | % | |
20 & below | 1857.8 | 67.5 | 2214.4 | 69.4 | 2543.4 | 70.5 | 2337.0 | 66.5 | 2074.0 | 64.9 |
21 to 25 | 6461.2 | 64.9 | 6986.4 | 66.3 | 7479.5 | 66.8 | 7513.9 | 64.7 | 7545.7 | 64.2 |
25 & below Total | 8319.0 | 65.5 | 9200.8 | 67.0 | 10023.0 | 67.7 | 9850.9 | 65.1 | 9619.7 | 64.4 |
26 to 30 | 8292.0 | 65.8 | 8755.5 | 67.6 | 8844.3 | 68.0 | 8534.5 | 67.3 | 8190.3 | 66.6 |
31 to 35 | 7219.2 | 68.4 | 7724.4 | 69.3 | 8109.1 | 70.1 | 7528.0 | 68.7 | 7196.3 | 67.9 |
26 to 35 Total | 15511.2 | 66.9 | 16479.9 | 68.4 | 16953.4 | 68.9 | 16062.5 | 67.9 | 15386.6 | 67.2 |
36 to 40 | 7516.4 | 70.2 | 7441.0 | 71.6 | 7112.1 | 71.7 | 6162.6 | 70.5 | 5489.9 | 69.3 |
41 to 45 | 6626.7 | 71.1 | 6775.3 | 72.9 | 6695.5 | 73.0 | 5846.5 | 71.8 | 5127.5 | 71.0 |
36 to 45 Total | 14143.0 | 70.6 | 14216.3 | 72.2 | 13807.6 | 72.4 | 12009.1 | 71.1 | 10617.4 | 70.1 |
46 to 50 | 4658.5 | 71.7 | 4793.7 | 72.7 | 4876.3 | 73.2 | 4398.6 | 72.9 | 3964.1 | 71.6 |
51 to 55 | 2777.0 | 71.4 | 2892.2 | 73.7 | 2888.5 | 72.5 | 2615.5 | 73.1 | 2338.0 | 71.8 |
46 to 55 Total | 7435.5 | 71.6 | 7685.9 | 73.1 | 7764.7 | 72.9 | 7014.0 | 72.9 | 6302.1 | 71.7 |
56 to 60 | 1708.0 | 71.6 | 1744.6 | 73.0 | 1808.1 | 74.2 | 1636.2 | 73.6 | 1480.0 | 74.0 |
61 to 65 | 1339.5 | 75.3 | 1355.0 | 75.3 | 1365.2 | 76.5 | 1191.0 | 76.6 | 1033.5 | 75.7 |
66 & over | 991.8 | 70.7 | 1065.8 | 73.3 | 1045.0 | 74.0 | 1042.3 | 75.9 | 992.2 | 74.0 |
56 & over Total | 4039.3 | 72.6 | 4165.4 | 73.8 | 4218.2 | 74.9 | 3869.6 | 75.1 | 3505.7 | 74.5 |
Grand Total | 49448.0 | 68.8 | 51748.3 | 70.2 | 52767.0 | 70.6 | 48806.0 | 69.3 | 45431.5 | 68.3 |
My work in SAS so far
Using the below code I have been able to replicate Pivot Table 1 but I cannot work out how I can alter the code to be able to create Final Table 1
PROC TABULATE DATA=WORK.E_AND_D_SUMMARY (where = (all = '1'));
VAR total_completions
total_registrations;
CLASS acad_yr / ORDER=UNFORMATTED MISSING;
CLASS age / ORDER=UNFORMATTED MISSING;
CLASS ageband / ORDER=UNFORMATTED MISSING;
TABLE
/* ROW Statement */
(ageband *
(age *(total_registrations * Sum=' ' total_completions * Sum=' ' )
all = 'Total' *(total_registrations * Sum=' ' total_completions * Sum=' ' ))
all = 'Total' *(total_registrations * Sum=' ' total_completions * Sum=' ' )) ,
/* COLUMN Statement */
acad_yr ;
;
RUN;
Proc tabulate output
Financial Year derived from course start date | ||||||||
2009/10 | 2010/11 | 2011/12 | 2012/13 | 2013/14 | ||||
Ageband for the Age Groups | Age Groups | 2750.88 | 3189.79 | 3606.54 | 3514.46 | 3199.29 | ||
25 & below | 20 & below | total_registrations | ||||||
total_completions | 1857.75 | 2214.42 | 2543.42 | 2336.96 | 2074.79 | |||
21 to 25 | total_registrations | 9956.54 | 10542.63 | 11193.08 | 11608.21 | 11788.67 | ||
total_completions | 6461.21 | 6986.42 | 7479.54 | 7513.92 | 7546.92 | |||
Total | total_registrations | 12707.42 | 13732.42 | 14799.63 | 15122.67 | 14987.96 | ||
total_completions | 8318.96 | 9200.83 | 10022.96 | 9850.87 | 9621.71 | |||
26 to 35 | Age Groups | 12608.13 | 12949.33 | 13015.08 | 12685.33 | 12347.17 | ||
26 to 30 | total_registrations | |||||||
total_completions | 8291.96 | 8755.5 | 8844.29 | 8533.96 | 8189.04 | |||
31 to 35 | total_registrations | 10561 | 11140.71 | 11574.5 | 10958.29 | 10633.5 | ||
total_completions | 7219.25 | 7724.37 | 8109.62 | 7528 | 7196.08 | |||
Total | total_registrations | 23169.13 | 24090.04 | 24589.58 | 23643.63 | 22980.67 | ||
total_completions | 15511.21 | 16479.87 | 16953.92 | 16061.96 | 15385.12 | |||
36 to 45 | Age Groups | 10699.71 | 10396.17 | 9913.5 | 8741.04 | 7950.08 | ||
36 to 40 | total_registrations | |||||||
total_completions | 7516.37 | 7441 | 7112.12 | 6162.62 | 5489.87 | |||
41 to 45 | total_registrations | 9323.04 | 9299.75 | 9170.33 | 8143.83 | 7259.04 | ||
total_completions | 6626.67 | 6775.29 | 6695.5 | 5846.46 | 5127.54 | |||
Total | total_registrations | 20022.75 | 19695.92 | 19083.83 | 16884.87 | 15209.12 | ||
total_completions | 14143.04 | 14216.29 | 13807.62 | 12009.08 | 10617.42 | |||
46 to 55 | Age Groups | 6500.29 | 6593 | 6664.42 | 6035.42 | 5550.08 | ||
46 to 50 | total_registrations | |||||||
total_completions | 4658.5 | 4793.75 | 4876.75 | 4399.08 | 3963.13 | |||
51 to 55 | total_registrations | 3887.67 | 3925.38 | 3983.71 | 3579.83 | 3262.92 | ||
total_completions | 2776.96 | 2892.17 | 2888.46 | 2615.46 | 2338.21 | |||
Total | total_registrations | 10387.96 | 10518.37 | 10648.12 | 9615.25 | 8813 | ||
total_completions | 7435.46 | 7685.92 | 7765.21 | 7014.54 | 6301.33 | |||
56 & over | Age Groups | 2385.71 | 2389.54 | 2435.42 | 2223.67 | 2000.71 | ||
56 to 60 | total_registrations | |||||||
total_completions | 1708 | 1744.62 | 1808.08 | 1636.21 | 1480 | |||
61 to 65 | total_registrations | 1778.33 | 1800.21 | 1783.83 | 1555.75 | 1365.92 | ||
total_completions | 1339.54 | 1354.96 | 1365.17 | 1191.04 | 1033.63 | |||
66 & over | total_registrations | 1403.37 | 1454.29 | 1412.21 | 1373.92 | 1341.08 | ||
total_completions | 991.79 | 1065.79 | 1045 | 1042.33 | 992.04 | |||
Total | total_registrations | 5567.42 | 5644.04 | 5631.46 | 5153.33 | 4707.71 | ||
total_completions | 4039.33 | 4165.37 | 4218.25 | 3869.58 | 3505.67 | |||
Total | total_registrations | 71854.67 | 73680.79 | 74752.63 | 70419.75 | 66698.46 | ||
total_completions | 49448 | 51748.29 | 52767.96 | 48806.04 | 45431.25 |
I cannot find a way of being able to calculate the percentage of completions against total registrations
Is it possible to get proc tabulate to produce a layout similar to Final Table 1
Thanks in advance
Without having some data to play with this is a bit difficult. BUT if I understand that your issue is with getting the Age band in the same column and above the ages then proc format should fix the problem. I suspect you have the age as a character value (which while Excel does that it isn't the optimum with SAS as you could have your data numeric and group on the fly just using a different format.
Any way here is a stab:
NOTE order of definition in multilabel formats is important
Proc format;
value $agebanding (multilabel)
'20 & below' ='20 & below' /* these duplications are needed because you want both the group and individual values*/
'21 to 25' ='21 to 25'
'20 & below','21 to 25'='25 & below'
'26 to 30' = '26 to 30'
'31 to 35' = '31 to 35'
'26 to 30','31 to 35' = '26 to 35'
<continue in this fashion>
;
/* if you had age as a numeric */
value agegbanding (multilabel)
0 - 20 = '20 & below'
21 - 25= '21 to 25'
0 - 25 = '25 & below'
26 - 30 = '26 to 30'
31 - 35 = '31 to 35'
26 - 35= '26 to 35'
<and again continue>
;
run;
In your proc tabulate modify the class statement for age
Class age / mlf ;
format age $agebanding.;
And do not use an ALL on age.
You may need to add an ORDER to the CLASS statement for AGE but with out data to test on I'm not sure what it may need to be.
I was unable to attach data to the thread and I therefore went to SAS to get assistance.
They provided me with the following code which has worked (There have been soime slight changes to the variable names compared to the above)
proc tabulate data=e_and_d_summary (where=(course_group="Undergraduate"));
var total_regftes total_compftes;
class age_hesa / order=unformatted missing;
class ageband_hesa / order=unformatted missing;
class acad_yr / order=unformatted missing;
table
/* Row Dimension */
(ageband_hesa=""*(age_hesa="" ALL="Total"))
all = "Total",
/* column dimension */
acad_yr="Year"*total_compFTEs=""*(sum="RegFTE No." pctsum<total_regFTEs>="%")
;
run;
I would suggest to use proc report . Post it at ODS and Base Reporting
And Cynthia will give good advice .
Xia Keshan
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.