Hello,
I bet this is easy but I've been failing for a while now so maybe someone can help me. Below is sample data.
ID | H-Start | H-End | H_1 | H_2 | H_3 | H_4 | H_5 | H_6 | H_7 | H_8 | H_9 | H_10 | H_11 | H_12 |
A | 3 | 6 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 0 | 1 | 1 | 1 |
B | 2 | 5 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 |
C | 9 | 12 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
1 |
I need to sum the H_# for certain ranges. The variable H-Start has the start H value and H-End has the end. For ID 'A' I want to sum H_3 through H_6. I've been trying to write a do loop and macros to solve this but have been failing.
I would like the output to look like this:
ID | H-Start | H-End | H_1 | H_2 | H_3 | H_4 | H_5 | H_6 | H_7 | H_8 | H_9 | H_10 | H_11 | H_12 | Want_Sum |
1 | 3 | 6 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 0 | 1 | 1 | 1 | 4 |
2 | 2 | 5 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
3 | 9 | 12 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
Thank you!
data want;
set have;
array _h(*) h_1 - h_12;
total_sum = 0;
do i=h_start to H_end;
total_sum = sum(total_sum, _h(i));
end;
run;
data have;
input ID $ H_Start H_End H_1 H_2 H_3 H_4 H_5 H_6 H_7 H_8 H_9 H_10 H_11 H_12;
datalines;
A 3 6 0 1 1 1 1 1 0 1 0 1 1 1
B 2 5 0 1 0 0 0 0 1 1 1 1 1 1
C 9 12 1 1 0 0 0 0 0 0 0 0 0 1
;
data want;
set have;
if id='A' then sum=sum(of h_3--h_6);
run;
Please provide a sample of your output table. Thank you
I added my wanted output to the original post.
With 60 H variables:
data want;
set have;
array h_ {60};
total_h = 0;
if (1 <= h_start <= 60) and (1 <= h_end <= 60) then do _n_=h_start to h_end;
total_h + h_{_n_};
end;
run;
This error message says you got it wrong. Your data set does not contain variables named HMO_BIOP_PR or HMO_BIOP. Maybe you spelled the variable names wrong. Or maybe you chose the wrong data set. But those variables don't exist in the data set that you are using.
If possible, it would be more helpful if you were to post the log so we can see any errors in context.
data want;
set have;
array _h(*) h_1 - h_12;
total_sum = 0;
do i=h_start to H_end;
total_sum = sum(total_sum, _h(i));
end;
run;
I'm confused. For every ID there is a different start and stop H variables. For "do i=h_start to H_end;" What do I put in the '_start' and the '_end'?
Edit again to add the post I was referring to.
data want;
set have;
array _h(*) h_1 - h_12;
total_sum = 0;
do i=h_start to H_end;
total_sum = sum(total_sum, _h(i));
end;
run;
Edited to add:
For A I need it to sum H_3 throught H_6, for B I need it to add H_2 though H_5, and C would be H_9 through H_12.
The variable H-Start is the number of H_# I want it to begin summing and the variable H-end has the H I want it to stop at.
@weweaw wrote:
I'm confused. For every ID there is a different start and stop H variables.
That's why you use the variable names.
@weweaw wrote:
. For "do i=h_start to H_end;" What do I put in the '_start' and the '_end'?
There are several solutions so we don' tknow which you're referring to.
It's best if you post your full code and log if you're having issues.
Ok, I see where I changed it wrong. No error but total_sum is 0 for everything now and the other new variable i has odd numbers that my data could not sum to.
data t3;
set try1;
array _h(*) hmo_1 - hmo_60;
total_sum = 0;
do i=biop_pr to biop;
total_sum = sum(total_sum, _h(i));
end;
run;
The variable i can be ignored. It will always be equal to BIOP + 1.
There are a couple of primary possibilities to explain your results. First and foremost, are the variable names HMO_1, HMO_2, etc. the proper names of existing variables in your data set? (Is it possible they are spelled incorrectly and should really be H_1, H_2, etc.?) And secondly, if you do have the proper names, are they always either 0 or a missing value for the relevant range of variables (from BIOP_PR to BIOP)?
Ok...given that it works for the sample data it's a bit hard to say what's wrong with yours.
1. Check correct variable mappings in all places
2. Check array definition - type out all H_ to test if necessary
data have;
input ID $ H_Start H_End H_1 H_2 H_3 H_4 H_5 H_6 H_7 H_8 H_9 H_10 H_11 H_12;
datalines;
A 3 6 0 1 1 1 1 1 0 1 0 1 1 1
B 2 5 0 1 0 0 0 0 1 1 1 1 1 1
C 9 12 1 1 0 0 0 0 0 0 0 0 0 1
;
data want;
set have;
array _h(*) h_1 - h_12;
do i=h_start to h_end;
total_sum = sum(total_sum, _h(i));
end;
run;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.