12-17-2012 11:42 AM
Hello, I realize that PROC SQL doesn't inherently have a do-loop function but I'm wondering if there's a way I can simulate it using macros or something else? I currently have a list of 53 three-letter combinations stored as a macro variable in my code. So it's basically "ABC", "ABB", "CBA", etc. I need to calculate % values for each one so that each one gets stored as its own variable. I can get this to work fine if I write 53 lines of code (one per each list value) but I'm wondering if there's a way I can take the values that I've stored in a macro variable and have the code write those repeated 53 lines for me to save space? I realize the final line would not have a comma at the end so I'd be fine with it writing 52 lines and then I'd include the 53rd manually. I've attempted various different %DO's, %DO_OVER, %ARRAY functions but can't seem to get it to do what I want. Any thoughts?
create table TABLENAME as
select GROUPVAR, count(*) as Total_Count,
(sum(case when LISTVALUE="ABC" then 1 else 0 end) / calculated Total_Count as pct_ABC), /*first of 52 repeated items*/
[Repeated 52 times for each 3-letter combo]
(sum(case when LISTVALUE="ZZZ" then 1 else 0 end) / calculated Total_Count as pct_ZZZ) /*final item from list*/
group by GROUPVAR
12-17-2012 11:51 AM
That seems extremely cumbersome.
Why not use proc freq or proc tabulate? Do you really need 53 new variables with percentages or one column per variable type?
12-17-2012 01:12 PM
Thanks for your reply. The reason why I *think* I need the dataset set up as wide w/53 new variables is because I have a separate dataset (Dataset #2) containing an "expected value" associated with each of the 53 groups. For each of the GROUPVAR's, I need to calculate an overall Expected Value based on the % found within each of the 53 groups (simple case-adjustment). See below for a snipped example. Dataset #1 is the one I'm looking to create if possible. If PROC TABULATE or PROC FREQ are better for this, I'm open to that -- I just happen to be more familiar with SQL for creating calculated tables like this. Hope this makes sense -- thanks!
|DATASET #1||%ABC||%ABB||%ACB||%BBB||etc.||%ZZZ||TOTAL EXPECTED VALUE|
12-17-2012 01:38 PM
Use proc freq.
proc freq data=ORIG_TABLE;
table GROUPVAR*listvalue/out=percentages outpct;
Then Transpose your dataset 2 with a proc transpose and then merge the two to get what you need.
12-17-2012 12:43 PM
Why not just use PROC SUMMARY or PROC FREQ? If you really need the information as variables rather than rows then you can use PROC TRANSPOSE.
12-17-2012 02:14 PM
I agree with my colleagues that you are doing work for which there are already existing procs but, from an academic perspective, something like the following (I think) could be used to "loop" through some data using proc sql:
/*create a test dataset*/
data orig_table (keep=groupvar listvalue);
set sashelp.class (rename=(sex=groupvar));
proc sql noprint;
select distinct "sum(case when listvalue='"||strip(listvalue)||
"' then 1 else 0 end) /calculated Total_Count as pct_"||
separated by ","
create table TABLENAME as
select groupvar, count(*) as Total_Count,
group by groupvar