Hello, I realize that PROC SQL doesn't inherently have a do-loop function but I'm wondering if there's a way I can simulate it using macros or something else? I currently have a list of 53 three-letter combinations stored as a macro variable in my code. So it's basically "ABC", "ABB", "CBA", etc. I need to calculate % values for each one so that each one gets stored as its own variable. I can get this to work fine if I write 53 lines of code (one per each list value) but I'm wondering if there's a way I can take the values that I've stored in a macro variable and have the code write those repeated 53 lines for me to save space? I realize the final line would not have a comma at the end so I'd be fine with it writing 52 lines and then I'd include the 53rd manually. I've attempted various different %DO's, %DO_OVER, %ARRAY functions but can't seem to get it to do what I want. Any thoughts?
Thanks.
proc sql;
create table TABLENAME as
select GROUPVAR, count(*) as Total_Count,
(sum(case when LISTVALUE="ABC" then 1 else 0 end) / calculated Total_Count as pct_ABC), /*first of 52 repeated items*/
.
[Repeated 52 times for each 3-letter combo]
.
(sum(case when LISTVALUE="ZZZ" then 1 else 0 end) / calculated Total_Count as pct_ZZZ) /*final item from list*/
from ORIG_TABLE
group by GROUPVAR
quit;
That seems extremely cumbersome.
Why not use proc freq or proc tabulate? Do you really need 53 new variables with percentages or one column per variable type?
Thanks for your reply. The reason why I *think* I need the dataset set up as wide w/53 new variables is because I have a separate dataset (Dataset #2) containing an "expected value" associated with each of the 53 groups. For each of the GROUPVAR's, I need to calculate an overall Expected Value based on the % found within each of the 53 groups (simple case-adjustment). See below for a snipped example. Dataset #1 is the one I'm looking to create if possible. If PROC TABULATE or PROC FREQ are better for this, I'm open to that -- I just happen to be more familiar with SQL for creating calculated tables like this. Hope this makes sense -- thanks!
DATASET #1 | %ABC | %ABB | %ACB | %BBB | etc. | %ZZZ | TOTAL EXPECTED VALUE |
GROUPVAR1 | 0.45 | 0 | 0.08 | 0.09 | ~ | 0.04 | |
GROUPVAR2 | 0.04 | 0.08 | 0.18 | 0.42 | ~ | 0.05 | |
GROUPVAR3 | 0 | 0 | 0.85 | 0.02 | ~ | 0.01 | |
DATASET #2 | ABC | ABB | ACB | BBB | etc. | ZZZ | |
EXPECTED HRS/DAY | 12 | 3 | 24 | 4 | ~ | 22 |
Use proc freq.
proc freq data=ORIG_TABLE;
table GROUPVAR*listvalue/out=percentages outpct;
quit;
Then Transpose your dataset 2 with a proc transpose and then merge the two to get what you need.
PROC FREQ worked perfectly. Thanks all!
Why not just use PROC SUMMARY or PROC FREQ? If you really need the information as variables rather than rows then you can use PROC TRANSPOSE.
I agree with my colleagues that you are doing work for which there are already existing procs but, from an academic perspective, something like the following (I think) could be used to "loop" through some data using proc sql:
/*create a test dataset*/
data orig_table (keep=groupvar listvalue);
set sashelp.class (rename=(sex=groupvar));
listvalue=put(age,2.);
run;
proc sql noprint;
select distinct "sum(case when listvalue='"||strip(listvalue)||
"' then 1 else 0 end) /calculated Total_Count as pct_"||
strip(listvalue)
into :loop
separated by ","
from orig_table
;
create table TABLENAME as
select groupvar, count(*) as Total_Count,
&loop.
from orig_table
group by groupvar
;
quit;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.