Hello!
I was wondering if there is a way to do a pie chart where each slice is a different variable?
The person working on the data before me left a mess and I am trying to make sense of a few things. I have a very large dataset that i subsetted into the variables that I need. Ideally, I would want a piechart grouped by the variable Tricho_1yr (3 rankings) with 14 slices that are the sum of all the numbers in each of the expenditure values across the rankings. I would replicate this 4 times for 4 different quarters.
The issue is that my data is wide and not long. Below I try to give you a sample of my data, but it is a mess and hard to read.
label STUDYNO="Sample size: 55,144,41,185,191, Total:616" PID="SUBJID#" Day_1826="Day1826" Day_InitialIW="IWInitial" TRICHOT_1YR="RA 273, UR 140, RNA 80, Missing 123" PostResYr_HOUSE_P_Tot="Housing Total/Total Expenditure" PostResYr_CONSUME_P_Tot="Consumab
le Goods Total/Total Expenditure" PostResYr_DURABLE_P_Tot="Durable Goods Total/Total Expenditure" PostResYr_TRANS_P_Tot="Transportation Total/Total Expenditure" PostResYr_HEALTH_P_Tot="Health Care Total/Total Expenditure" PostResYr_INSURE_P_Tot="Insurance
Total/Total Expenditure" PostResYr_LOAN_P_Tot="Loan Payments Total/Total Expenditure" PostResYr_TAX_P_Tot="Taxes Total/Total Expenditure" PostResYr_EDUCA_P_Tot="Educational Expenses Total/Total Expenditure" PostResYr_GIFT_P_Tot="Gifts or Contributions Tota
l/Total Expenditure" PostResYr_ENTER_P_Tot="Entertainment Total/Total Expenditure" PostResYr_RECRE_P_Tot="RECREATION TOTAL/Total Expenditure" PostResYr_FINPLAN_P_Tot="Financial Planning Total/Total Expenditure" PostResYr_LEGALA_P_Tot="LEGAL EXPENSES TOTAL/
Total Expenditure" PostResYr1stQ_HOUSE_P_Tot="Housing Total/Total Expenditure" PostResYr1stQ_CONSUME_P_Tot="Consumable Goods Total/Total Expenditure" PostResYr1stQ_DURABLE_P_Tot="Durable Goods Total/Total Expenditure" PostResYr1stQ_TRANS_P_Tot="Transportat
ion Total/Total Expenditure" PostResYr1stQ_HEALTH_P_Tot="Health Care Total/Total Expenditure" PostResYr1stQ_INSURE_P_Tot="Insurance Total/Total Expenditure" PostResYr1stQ_LOAN_P_Tot="Loan Payments Total/Total Expenditure" PostResYr1stQ_TAX_P_Tot="Taxes Tot
al/Total Expenditure" PostResYr1stQ_EDUCA_P_Tot="Educational Expenses Total/Total Expenditure" PostResYr1stQ_GIFT_P_Tot="Gifts or Contributions Total/Total Expenditure" PostResYr1stQ_ENTER_P_Tot="Entertainment Total/Total Expenditure" PostResYr1stQ_RECRE_P
_Tot="RECREATION TOTAL/Total Expenditure" PostResYr1stQ_FINPLAN_P_Tot="Financial Planning Total/Total Expenditure" PostResYr1stQ_LEGALA_P_Tot="LEGAL EXPENSES TOTAL/Total Expenditure" PostResYr2ndQ_HOUSE_P_Tot="Housing Total/Total Expenditure" PostResYr2ndQ
_CONSUME_P_Tot="Consumable Goods Total/Total Expenditure" PostResYr2ndQ_DURABLE_P_Tot="Durable Goods Total/Total Expenditure" PostResYr2ndQ_TRANS_P_Tot="Transportation Total/Total Expenditure" PostResYr2ndQ_HEALTH_P_Tot="Health Care Total/Total Expenditure
" PostResYr2ndQ_INSURE_P_Tot="Insurance Total/Total Expenditure" PostResYr2ndQ_LOAN_P_Tot="Loan Payments Total/Total Expenditure" PostResYr2ndQ_TAX_P_Tot="Taxes Total/Total Expenditure" PostResYr2ndQ_EDUCA_P_Tot="Educational Expenses Total/Total Expenditur
e" PostResYr2ndQ_GIFT_P_Tot="Gifts or Contributions Total/Total Expenditure" PostResYr2ndQ_ENTER_P_Tot="Entertainment Total/Total Expenditure" PostResYr2ndQ_RECRE_P_Tot="RECREATION TOTAL/Total Expenditure" PostResYr2ndQ_FINPLAN_P_Tot="Financial Planning To
tal/Total Expenditure" PostResYr2ndQ_LEGALA_P_Tot="LEGAL EXPENSES TOTAL/Total Expenditure" PostResYr3rdQ_HOUSE_P_Tot="Housing Total/Total Expenditure" PostResYr3rdQ_CONSUME_P_Tot="Consumable Goods Total/Total Expenditure" PostResYr3rdQ_DURABLE_P_Tot="Durab
le Goods Total/Total Expenditure" PostResYr3rdQ_TRANS_P_Tot="Transportation Total/Total Expenditure" PostResYr3rdQ_HEALTH_P_Tot="Health Care Total/Total Expenditure" PostResYr3rdQ_INSURE_P_Tot="Insurance Total/Total Expenditure" PostResYr3rdQ_LOAN_P_Tot="L
oan Payments Total/Total Expenditure" PostResYr3rdQ_TAX_P_Tot="Taxes Total/Total Expenditure" PostResYr3rdQ_EDUCA_P_Tot="Educational Expenses Total/Total Expenditure" PostResYr3rdQ_GIFT_P_Tot="Gifts or Contributions Total/Total Expenditure" PostResYr3rdQ_E
NTER_P_Tot="Entertainment Total/Total Expenditure" PostResYr3rdQ_RECRE_P_Tot="RECREATION TOTAL/Total Expenditure" PostResYr3rdQ_FINPLAN_P_Tot="Financial Planning Total/Total Expenditure" PostResYr3rdQ_LEGALA_P_Tot="LEGAL EXPENSES TOTAL/Total Expenditure" P
ostResYr4thQ_HOUSE_P_Tot="Housing Total/Total Expenditure" PostResYr4thQ_CONSUME_P_Tot="Consumable Goods Total/Total Expenditure" PostResYr4thQ_DURABLE_P_Tot="Durable Goods Total/Total Expenditure" PostResYr4thQ_TRANS_P_Tot="Transportation Total/Total Expe
nditure" PostResYr4thQ_HEALTH_P_Tot="Health Care Total/Total Expenditure" PostResYr4thQ_INSURE_P_Tot="Insurance Total/Total Expenditure" PostResYr4thQ_LOAN_P_Tot="Loan Payments Total/Total Expenditure" PostResYr4thQ_TAX_P_Tot="Taxes Total/Total Expenditure
" PostResYr4thQ_EDUCA_P_Tot="Educational Expenses Total/Total Expenditure" PostResYr4thQ_GIFT_P_Tot="Gifts or Contributions Total/Total Expenditure" PostResYr4thQ_ENTER_P_Tot="Entertainment Total/Total Expenditure" PostResYr4thQ_RECRE_P_Tot="RECREATION TOT
AL/Total Expenditure" PostResYr4thQ_FINPLAN_P_Tot="Financial Planning Total/Total Expenditure" PostResYr4thQ_LEGALA_P_Tot="LEGAL EXPENSES TOTAL/Total Expenditure";
datalines;
2 486 30DEC1992 05OCT1993 1 0.2206379704 0.0740041108 . 0.1281328319 0.0582095192 . 0.0592032887 . . 0.3090553333 0.0158580237 0 . 0 0.1460914139 0.0427167877 . 0.166595472 0.0153780436 . 0.0170867151 . . 0.4527979496 0.010252029 0 . 0 0.3930432118
0.1149249157 . 0.0459699663 0.1342323016 . 0.1838798652 . . 0 0.0160894882 0 . 0 0.4021006427 0.1175732873 . 0.0470293149 0.1299576736 . 0.1881172598 . . 0 0.0164602602 0 . 0 0.1716809339 0.0910774185 . 0.1420807729 0.062661264 . 0 . . 0.3658279347
0.0273232256 0 . 0
2 495 05MAR1993 25AUG1993 3 0.2533392991 0.119096185 . 0.0085538696 0.047112082 . 0 . . 0.0131597994 0.0356630565 0 . 0 0.295661769 0.1509959928 . 0.0087113073 0.0290376909 . 0 . . 0 0.0278761833 0 . 0 0.4112680381 0.195732269 . 0.0153515505 0.0107460854 .
0 . . 0 0.0560331593 0 . 0 0.3346513573 0.1482776975 . 0.0114059767 0 . 0 . . 0 0.0570298837 0 . 0 0.1350372802 0.0598324691 . 0.0046024976 0.0902089534 . 0 . . 0.0306833175 0.0230124881 0 . 0
2 496 22JUN1993 30AUG1993 3 0.2596819952 0.0963660052 . 0.0893913669 0.0056213503 . 0.0749513374 . . 0.0159866444 0.0130495632 0.0180240121 . 0.0059485188 0.2595605549 0.0416295231 . 0.1810645004 0.0002392501 . 0.0241164134 . . 0.0083737546 0.0033973519 0
. 0 0.3066371174 0.1429062703 . 0.0378281304 0.001146307 . 0.1155477437 . . 0 0.0137556838 0 . 0 0.2539074859 0.1183319623 . 0.0782762767 0.0210719484 . 0.0956780358 . . 0 0.0223058913 0.0094918686 . 0 0.2298570733 0.1071234211 . 0.0283561997 0.0014321313
. 0.0866153009 . . 0.0515567267 0.015696159 0.0608369375 . 0.0229141008
2 533 21AUG1993 15NOV1993 1 0.2807925044 0.2210618778 . 0.091255124 0 . 0.10833496 . . 0.0094345761 0.065066042 0.0130132084 . 0 0.3424818993 0.2424650615 . 0.0572486951 0 . 0.1212325307 . . 0.0151540663 0.0008418926 0 . 0 0.2162029299 0.1754989677 .
0.1528856553 0 . 0.085537312 . . 0.0196637499 0.0653819683 0 . 0 0.3091087996 0.2509136913 . 0.0639583919 0 . 0.122294068 . . 0 0.0934776497 0 . 0 0.2926537131 0.2375565611 . 0.0605536332 0 . 0.1157838701 . . 0 0.0885014639 0.0532339633 . 0
2 557 15JUN1993 16NOV1993 3 0.1911581952 0.1958830116 . 0.1699949553 0 . 0.0855142544 . . 0.0167337246 0.0032483112 0.0004921684 . 0 0.2398427261 0.5766710354 . 0.0397553517 0 . 0.0884665793 . . 0 0.0170380079 0 . 0 0.2266657042 0.2540965856 . 0.1654515267
0 . 0.0967299502 . . 0.079405183 0.0077961452 0 . 0 0.1483161752 0.1035305066 . 0.1506427034 0 . 0.0697958471 . . 0.0075612168 0 0.0011632641 . 0 0.2135857274 0.1490912137 . 0.2504397353 0 . 0.1005109306 . . 0 0 0 . 0
2 688 17DEC1993 16MAR1994 3 0.2931831752 0.1577749314 . 0.0683423694 0.0776550975 . 0.1172868659 . . 0.006413085 0.0080891058 0 . 0 0.1312013142 0.0986386855 . 0.0160215972 0.0133722999 . 0.0530812008 . . 0.0289948847 0 0 . 0 0.3343801101 0.1705562978 .
0.0822859331 0.1349489304 . 0.132405547 . . 0 0.00615798 0 . 0 0.334203543 0.1751687536 . 0.0814381047 0.0623846614 . 0.1359862692 . . 0 0.0063245139 0 . 0 0.349270049 0.178151166 . 0.0859501239 0.0892318559 . 0.138301563 . . 0 0.0189340309 0 . 0
2 724 31AUG1993 01MAR1994 2 0.3664520868 0.1343249576 . 0.065054087 0.0462156069 . 0.2385203171 . . 0.0170031592 0.0070053016 0.0034006318 . 0 0.3965744416 0.1026635019 . 0.0375455093 0.1173297164 . 0.2740822176 . . 0 0.0023465943 0 . 0 0.269151996
0.0756492297 . 0.0891864602 0.039815384 . 0.1860174742 . . 0.0796307681 0.0238892304 0 . 0 0.3900514326 0.177731112 . 0.0735533371 0.0149061716 . 0.2399370012 . . 0 0.0024608923 0 . 0 0.3913628563 0.1783286762 . 0.0679020729 0 . 0.2407437129 . . 0
0.0024691663 0.0137175905 . 0
2 831 26NOV1993 04APR1994 1 0.361376673 0.073040153 . 0.1077119184 0.1370299554 . 0.1219885277 . . 0.0101975781 0.0136392607 0.0191204589 . 0 0 0.1449459631 . 0.0330578512 0.1830896376 . 0.1048951049 . . 0.0305149396 0.0305149396 0 . 0 0.3441317047
0.1227827934 . 0.0106213489 0.1274561869 . 0.1698353691 . . 0.0424853956 0.009346787 0 . 0 0.4019851117 0.047311828 . 0.2522746071 0.0099255583 . 0.1081885856 . . 0 0.0109181141 0.0496277916 . 0 0.4563380282 0.0537089202 . 0.014084507 0.2685446009 .
0.1228169014 . . 0 0.0123943662 0 . 0
2 885 18NOV1993 26APR1994 1 0.4789861949 0.091220423 . 0.0335486241 0.0875447672 . 0 . . 0.2246778291 0.0015586642 0 . 0 0.6764631628 0.1383394835 . 0.0155628028 0 . 0 . . 0.0121044022 0 0 . 0 0.3808604018 0.067178245 . 0.0445894847 0.1392204434 . 0 . .
0.3504850323 0.002478708 0 . 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2 907 13DEC1993 18MAY1994 2 0.0930468131 0.1841751352 . 0.267917267 0.0441252928 . 0.1741030575 . . 0.0326143469 0.0083454358 0 . 0.0191849099 0.1468605889 0.3263568641 . 0.1780773324 0 . 0.1755941823 . . 0.008868393 0.0223483505 0 . 0 0.0412113162
0.0741210723 . 0.3910627776 0.0177890574 . 0.0978398155 . . 0.0192714788 0.0017789057 0 . 0.0148242145 0.1071566234 0.1927277399 . 0.249775151 0.0693819864 . 0.2544006167 . . 0 0.0046254658 0 . 0.038545548 0.0911575036 0.1639523445 . 0.2124822385
0.1158596568 . 0.2164170948 . . 0.109301563 0.0039348563 0 . 0.0327904689
2 936 28DEC1992 30MAR1994 3 0.1223633706 0.2276914332 . 0.0553024266 0.1386632518 . 0.142556443 . . 0.0373945785 0.0297079151 0.0105951306 . 0 0.0794634836 0.1976736244 . 0.081959031 0.0397317418 . 0.080645585 . . 0.0131344601 0.0288958122 0 . 0
0.1467517074 0.2492897594 . 0.0460950876 0.1903256759 . 0.1766664785 . . 0 0.0310436304 0.0310436304 . 0 0.147228147 0.2500990959 . 0.0673851904 0.1909435814 . 0.1772400385 . . 0 0.0311444157 0.0084939316 . 0 0.1335364914 0.2268408348 . 0.0179760662
0.1731865573 . 0.1607573916 . . 0.136960504 0.028248104 0.0077040284 . 0
2 992 01JAN1993 19MAY1994 1 0.1444450025 0.1221615635 . 0.0785801158 0.0123973484 . 0.0079470182 . . 0.0400529716 0.2762542461 0.2066224728 . 0 0.0982691234 0.0798436628 . 0.0437744277 0.0100502513 . 0 . . 0.0067001675 0.5371300949 0 . 0 0.1268880836
0.1501601339 . 0.0786816938 0 . 0 . . 0 0.421736649 0.2077861631 . 0 0.4304929363 0.3202065363 . 0.1585367 0 . 0 . . 0 0.0407307727 0 . 0 0.1136223619 0.0845138675 . 0.0831907541 0.024808376 . 0.0206736466 . . 0.0992335039 0.0583823781 0.4134729328 . 0
2 1034 07JAN1994 26JUL1994 1 0.2935014118 0.1913426064 . 0.0683742474 0.0577527045 . 0.1813094288 . . 0.0226056162 0.0103738102 0.0131298374 . 0 0.3272611058 0.1980398423 . 0.1065303079 0.0792585491 . 0.189623948 . . 0.0085224246 0.0170448493 0.0034089699
. 0 0.3377146991 0.2080751524 . 0.0686924876 0.0584587088 . 0.2460312565 . . 0.00560755 0.0119160438 0 . 0 0.3174915056 0.178723756 . 0.015575055 0.0161740956 . 0.1803112136 . . 0 0.0077875275 0.0479232461 . 0 0.2164194301 0.1850821012 . 0.0857491643
4 The SAS System 09:44 Tuesday, December 11, 2018
0.0772561086 . 0.1289307243 . . 0.0644653621 0.0066511882 0 . 0
2 1035 16FEB1994 21JUN1994 1 0.3806066891 0.0980036298 . 0.1239305159 0.0474980555 . 0.0534093855 . . 0.0679284418 0.0145190563 0.0194451646 . 0 0.5304518664 0.1452427729 . 0.0294695481 0.0084198709 . 0.2168116756 . . 0 0 0 . 0 0.45875 0.118125 . 0.18625
0.00225 . 0.064375 . . 0 0.0175 0.09375 . 0 0.6119857796 0.1523616049 . 0.093956323 0.0045708481 . 0 . . 0 0.0355510411 0 . 0 0.206135599 0.0513200661 . 0.1439812967 0.0984775047 . 0 . . 0.1493984148 0.0119746821 0 . 0
2 1051 13MAR1994 07JUL1994 1 0.3317276134 0.0363172016 . 0.0779848438 0.0115534003 . 0.3586370028 . . 0.0708709896 0.0081065345 0 . 0 0.5484502691 0 . 0.0307843618 0 . 0.1702401618 . . 0 0.0046273852 0 . 0 0.2258039225 0.0103987508 . 0.054413527
0.015680656 . 0.6165559667 . . 0.0108113997 0.0011554168 0 . 0 0.4282522989 0.0856508135 . 0.1267295981 0.0132654848 . 0.2958574544 . . 0 0.0022993507 0 . 0 0.2773267615 0.062384747 . 0.1080588173 0.0122794111 . 0.1786077711 . . 0.2349104724 0.0227436048 0
. 0
;;;;
I'm working on something akin to this but can't make sense of how to accomplish this goal as I am very new to graphs/charts in SAS.
proc gchart data=Piecharts(where=(Trichot_1yr="1" or Trichot_1yr="2" or Trichot_1yr="3" ));
pie sum(PostResYr1stQ_:) / group=Trichot_1yr
across=3
other=0
clockwise value=none
slice=outside percent=outside;
run;
quit;
Help would be much appreciated.
I am going to submit for consideration that each of these variables contains two pieces of data in the name.
PostResYr_AlcSpend="Amount spent on alcohol" PostResYr_HOUSE="Housing Total" PostResYr_CONSUME="Consumable Goods Total" PostResYr_DURABLE="Durable Goods Total (missing Study 2)" PostResYr_TRANS="Transportation Total" PostResYr_HEALTH="Health Care Total" PostResYr_INSURE="Insurance Total (missing Study 2)" PostResYr_LOAN="Loan Payments Total" PostResYr_TAX="Taxes Total (missing Study 2)" PostResYr_EDUCA="Educational Expenses Total (missing Study 2)" PostResYr_GIFT="Gifts or Contributions Total" PostResYr_ENTER="Entertainment Total" PostResYr_RECRE="RECREATION TOTAL" PostResYr_FINPLAN="Financial Planning Total (missing Study 2)" PostResYr_LEGALA="LEGAL EXPENSES TOTAL" PostResYr_INCOMETL="Income Total";
This is apparently a 1)time frame and 2) expenditure type or similar idea.
From your first set of variables it appears (from a quick scan, that mass of text is hard to read) that each of these are repeated for each quarter.
For a large number of purposes the data would be more useable as with a variable that indicated the time frame and the expenditure type plus a single numeric representing the amount.
Something like this would do that transposition:
data trans; set have; array _pp PostRes: ; length TimeFrame ExpenseType $ 20; do i= 1 to dim(_pp); TimeFrame = scan(vname(_pp[i]),1,'_'); ExpenseType = scan(vname(_pp[i]),2,'_'); Value = _pp[i]; output; end; keep stydyno pid trichot_1yr timeframe expensetype value; run;
Then create custom formats to get nice text for the timeframe and expensetype variables.
If this were my data I might even parse that timeframe finer. If ALL of the values of timeframe start with PostRes I would likely remove that part as repeated text is of little use frequently. If the PostRes implies there are corresponding PreRes values then that would be a definite additional parse to create a variable for "Pre" or "Post".
The additional variables make it easier to label time by quarter (your requirement) as you could select with something like: Where timeframe in ('value 1' 'value 2' <etc>); (Not looking up all your quarter abbreviations). If you parsed the variable name to have an actual numeric value of 1, 2, 3, or 4 (and missing or other code such as 0 for an annual) for the quarter then even easier to select.
You might consider "donut" instead of pie as each ring represents a level of a grouping variable such as your Trichot_1yr. You would slice by ExpenseType and group by Trichot_1yr
BTW, Pie charts with that many categories are often not very useful as the human eye / mind does not actually detect small differences of area and the more categories involved the harder to actually see.
If the main issue is wide vs. long then transpose the data to make it long.
If you are changing processes it would be a good time to change from GCHART to either Proc SGPLOT/SGPANEL or the Graphics Template language, especially if you are new to SAS as these are the places where most of the changes and improvements to graphics are occurring.
I would suggest starting with a smaller data set with fewer variables and fewer levels so discussion about the data is easier. Also include the actual INPUT statement to read the data lines.
If you provide an example with perhaps 3 groups and 3 or 4 subgroups of then working with a smaller example should scale up somewhat easier than trying to work with large numbers for learning.
By "quarters" do you mean "calendar quarters" or "quarter of the pie"?
The data is very hard to transpose as the variable times are within the variable names such that "PreResyr1stQ_Housing" and so forth. However, you reminded me that I had a transposed dataset already, thank you!
I shall change my processes to keep with the times and stay up to date. I appreciate the suggestion! Is there a preference of SGPLOT or Graphics Template language based on limited SAS knowledge?
I am not sure if you simply meant include the input in the data like this. However, my data looks something like this:
input STUDYNO:32. PID:BEST. TRICHOT_1YR:BEST12. PostResYr_AlcSpend:DOLLAR14.2 PostResYr_HOUSE:DOLLAR14.2 PostResYr_CONSUME:DOLLAR14.2 PostResYr_DURABLE:DOLLAR14.2 PostResYr_TRANS:DOLLAR14.2 PostResYr_HEALTH:DOLLAR14.2 PostResYr_INSURE:DOLLAR14.2 PostResYr_
LOAN:DOLLAR14.2 PostResYr_TAX:DOLLAR14.2 PostResYr_EDUCA:DOLLAR14.2 PostResYr_GIFT:DOLLAR14.2 PostResYr_ENTER:DOLLAR14.2 PostResYr_RECRE:DOLLAR14.2 PostResYr_FINPLAN:DOLLAR14.2 PostResYr_LEGALA:DOLLAR14.2 PostResYr_INCOMETL:DOLLAR14.2 Quarter:32.;
format PID BEST. TRICHOT_1YR BEST12. PostResYr_AlcSpend DOLLAR14.2 PostResYr_HOUSE DOLLAR14.2 PostResYr_CONSUME DOLLAR14.2 PostResYr_DURABLE DOLLAR14.2 PostResYr_TRANS DOLLAR14.2 PostResYr_HEALTH DOLLAR14.2 PostResYr_INSURE DOLLAR14.2 PostResYr_LOAN DOLLAR
14.2 PostResYr_TAX DOLLAR14.2 PostResYr_EDUCA DOLLAR14.2 PostResYr_GIFT DOLLAR14.2 PostResYr_ENTER DOLLAR14.2 PostResYr_RECRE DOLLAR14.2 PostResYr_FINPLAN DOLLAR14.2 PostResYr_LEGALA DOLLAR14.2 PostResYr_INCOMETL DOLLAR14.2;
label STUDYNO="Sample size: 55,144,41,185,191, Total:616" PID="SUBJID#" TRICHOT_1YR="RA 273, UR 140, RNA 80, Missing 123" PostResYr_AlcSpend="Amount spent on alcohol" PostResYr_HOUSE="Housing Total" PostResYr_CONSUME="Consumable Goods Total" PostResYr_DURA
BLE="Durable Goods Total (missing Study 2)" PostResYr_TRANS="Transportation Total" PostResYr_HEALTH="Health Care Total" PostResYr_INSURE="Insurance Total (missing Study 2)" PostResYr_LOAN="Loan Payments Total" PostResYr_TAX="Taxes Total (missing Study 2)"
PostResYr_EDUCA="Educational Expenses Total (missing Study 2)" PostResYr_GIFT="Gifts or Contributions Total" PostResYr_ENTER="Entertainment Total" PostResYr_RECRE="RECREATION TOTAL" PostResYr_FINPLAN="Financial Planning Total (missing Study 2)" PostResYr_L
EGALA="LEGAL EXPENSES TOTAL" PostResYr_INCOMETL="Income Total";
datalines;
2 486 1 $0.00 $5,119.48 $1,496.92 . $5,838.00 $538.89 . $598.77 . . $15,867.38 $359.26 $0.00 . $0.00 $44,907.68 1
2 486 1 $0.00 $3,839.61 $1,122.69 . $449.08 $1,311.30 . $1,796.31 . . $0.00 $157.18 $0.00 . $0.00 $0.00 2
2 486 1 $0.00 $3,839.61 $1,122.69 . $449.08 $1,240.95 . $1,796.31 . . $0.00 $157.18 $0.00 . $0.00 $0.00 3
2 486 1 $0.00 $2,821.70 $1,496.92 . $2,335.20 $1,029.88 . $0.00 . . $6,012.64 $449.08 $0.00 . $0.00 $0.00 4
2 495 3 $0.00 $7,620.83 $3,892.00 . $224.54 $748.46 . $0.00 . . $0.00 $718.52 $0.00 . $0.00 $23,801.07 1
The goal is to make a pie chart for each year quarter (which is the last column in the data) by each Trichot_1yr (1-3). The pie "slices" would be each of the 14 different variable columns.
Simplified my data looks like this:
ID Quarter Trichot_1yr Housing Transportation Insurance
1 1 1 1000 2000 500
1 2 1 900 100 6000
1 3 1
1 4 1
2 1 3
2 2 3
I want each of the variables (ie "housing", "transportation", etc) to be the slices of the pie. With 3 different pies for each trichot (1, 2, and 3) for quarter 1.
Then I would repeat for each quarter.
I am not sure how to accomplish this as it has so many levels.
I can definitely respond for clarity, and thank you for your response!
I am going to submit for consideration that each of these variables contains two pieces of data in the name.
PostResYr_AlcSpend="Amount spent on alcohol" PostResYr_HOUSE="Housing Total" PostResYr_CONSUME="Consumable Goods Total" PostResYr_DURABLE="Durable Goods Total (missing Study 2)" PostResYr_TRANS="Transportation Total" PostResYr_HEALTH="Health Care Total" PostResYr_INSURE="Insurance Total (missing Study 2)" PostResYr_LOAN="Loan Payments Total" PostResYr_TAX="Taxes Total (missing Study 2)" PostResYr_EDUCA="Educational Expenses Total (missing Study 2)" PostResYr_GIFT="Gifts or Contributions Total" PostResYr_ENTER="Entertainment Total" PostResYr_RECRE="RECREATION TOTAL" PostResYr_FINPLAN="Financial Planning Total (missing Study 2)" PostResYr_LEGALA="LEGAL EXPENSES TOTAL" PostResYr_INCOMETL="Income Total";
This is apparently a 1)time frame and 2) expenditure type or similar idea.
From your first set of variables it appears (from a quick scan, that mass of text is hard to read) that each of these are repeated for each quarter.
For a large number of purposes the data would be more useable as with a variable that indicated the time frame and the expenditure type plus a single numeric representing the amount.
Something like this would do that transposition:
data trans; set have; array _pp PostRes: ; length TimeFrame ExpenseType $ 20; do i= 1 to dim(_pp); TimeFrame = scan(vname(_pp[i]),1,'_'); ExpenseType = scan(vname(_pp[i]),2,'_'); Value = _pp[i]; output; end; keep stydyno pid trichot_1yr timeframe expensetype value; run;
Then create custom formats to get nice text for the timeframe and expensetype variables.
If this were my data I might even parse that timeframe finer. If ALL of the values of timeframe start with PostRes I would likely remove that part as repeated text is of little use frequently. If the PostRes implies there are corresponding PreRes values then that would be a definite additional parse to create a variable for "Pre" or "Post".
The additional variables make it easier to label time by quarter (your requirement) as you could select with something like: Where timeframe in ('value 1' 'value 2' <etc>); (Not looking up all your quarter abbreviations). If you parsed the variable name to have an actual numeric value of 1, 2, 3, or 4 (and missing or other code such as 0 for an annual) for the quarter then even easier to select.
You might consider "donut" instead of pie as each ring represents a level of a grouping variable such as your Trichot_1yr. You would slice by ExpenseType and group by Trichot_1yr
BTW, Pie charts with that many categories are often not very useful as the human eye / mind does not actually detect small differences of area and the more categories involved the harder to actually see.
HI again @ballardw!
You are absolutely correct in saying that each variable contains two pieces of data in the name. For the giant dataset, I started to "correct" the data as you specified. I have already pulled out the time in the sense of half year, quarter, and month. It would be wise to pull out the pre and post.
In the subset of data, they are not repeated for each quarter because I did pull out the time, but are repeated for pre and post.
In this particular case, I am just looking at post by quarter and the trichot is a different status for each.
I thought about donut, but once the data is parsed, I simply want each of the 14 expenditures by each status. So that would be 14 slices in each of the 3 pies. Then for each quarter. However, since I have multiple ID's, I am not sure if I need to sum all the ID's by each expenditure category first or there is a way that I can do this in SAS while making the graph.
However, I will probably pull out the Pre and Post part. Thank you again for the reply.
I had thought to not use a pie chart, but I am not sure what would better represent that data visually.
Bar charts are often a better approach especially if several of the values displayed are similar in magnitude as the human eye can actually detect pretty small variations of the tops of the bars.
To make multiple charts, of which every style you choose you could use SGPLOT with BY group processing for which requires the data to be sorted by those variables. Or SGPANEL with Panelby though that takes a bit more work but places all the graphs in a single image is specified correctly.
The vbar (or hbar) in sgplot lets you specify a category variable and optionally a response variable (such as expenditure), a statistic such as Freq, Mean, Median, Percent or SUM which will be applied to the response variable (mean, median and sum kind of require a response variable) and a group variable to display the sum for sub-groups of your category variable. The groupdisplay option controls how the subgroups are used on the graph, stack places them one above the other and cluster side-by-side.
so you could do something like this in SGPLOT
vbar quarter /response=value stat=sum group=expenditure groupdisplay=stack would have one bar per quarter with the values summed by expenditure type.
If you do a BY trichot you would get one bar chart be level of the trichot variable.
Or as I mentioned SGPANEL.
It is also simple to create a second graph by changing role of expenditure and quarter so that for the same type of expenditure you would a sum per quarter which may actually be easier to interpret as one might expect the relative quarter-to-quarter expenditures to be similar.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.