- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello,
I bet this is easy but I've been failing for a while now so maybe someone can help me. Below is sample data.
ID | H-Start | H-End | H_1 | H_2 | H_3 | H_4 | H_5 | H_6 | H_7 | H_8 | H_9 | H_10 | H_11 | H_12 |
A | 3 | 6 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 0 | 1 | 1 | 1 |
B | 2 | 5 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 |
C | 9 | 12 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
1 |
I need to sum the H_# for certain ranges. The variable H-Start has the start H value and H-End has the end. For ID 'A' I want to sum H_3 through H_6. I've been trying to write a do loop and macros to solve this but have been failing.
I would like the output to look like this:
ID | H-Start | H-End | H_1 | H_2 | H_3 | H_4 | H_5 | H_6 | H_7 | H_8 | H_9 | H_10 | H_11 | H_12 | Want_Sum |
1 | 3 | 6 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 0 | 1 | 1 | 1 | 4 |
2 | 2 | 5 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
3 | 9 | 12 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
Thank you!
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
data want;
set have;
array _h(*) h_1 - h_12;
total_sum = 0;
do i=h_start to H_end;
total_sum = sum(total_sum, _h(i));
end;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
data have;
input ID $ H_Start H_End H_1 H_2 H_3 H_4 H_5 H_6 H_7 H_8 H_9 H_10 H_11 H_12;
datalines;
A 3 6 0 1 1 1 1 1 0 1 0 1 1 1
B 2 5 0 1 0 0 0 0 1 1 1 1 1 1
C 9 12 1 1 0 0 0 0 0 0 0 0 0 1
;
data want;
set have;
if id='A' then sum=sum(of h_3--h_6);
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Sorry, maybe I wasn't clear. I want to add a variable to sum the specific H_# for each person. My real data has about 30K ID's and 60 H values.
Thanks.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Please provide a sample of your output table. Thank you
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I added my wanted output to the original post.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
With 60 H variables:
data want;
set have;
array h_ {60};
total_h = 0;
if (1 <= h_start <= 60) and (1 <= h_end <= 60) then do _n_=h_start to h_end;
total_h + h_{_n_};
end;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
NOTE: Character values have been converted to numeric values at the places given by:
(Line):(Column).
613:14
NOTE: Variable hmo_biop_pr is uninitialized.
NOTE: Variable hmo_biop is uninitialized.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
This error message says you got it wrong. Your data set does not contain variables named HMO_BIOP_PR or HMO_BIOP. Maybe you spelled the variable names wrong. Or maybe you chose the wrong data set. But those variables don't exist in the data set that you are using.
If possible, it would be more helpful if you were to post the log so we can see any errors in context.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
data want;
set have;
array _h(*) h_1 - h_12;
total_sum = 0;
do i=h_start to H_end;
total_sum = sum(total_sum, _h(i));
end;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I'm confused. For every ID there is a different start and stop H variables. For "do i=h_start to H_end;" What do I put in the '_start' and the '_end'?
Edit again to add the post I was referring to.
data want;
set have;
array _h(*) h_1 - h_12;
total_sum = 0;
do i=h_start to H_end;
total_sum = sum(total_sum, _h(i));
end;
run;
Edited to add:
For A I need it to sum H_3 throught H_6, for B I need it to add H_2 though H_5, and C would be H_9 through H_12.
The variable H-Start is the number of H_# I want it to begin summing and the variable H-end has the H I want it to stop at.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@weweaw wrote:
I'm confused. For every ID there is a different start and stop H variables.
That's why you use the variable names.
@weweaw wrote:
. For "do i=h_start to H_end;" What do I put in the '_start' and the '_end'?
There are several solutions so we don' tknow which you're referring to.
It's best if you post your full code and log if you're having issues.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Ok, I see where I changed it wrong. No error but total_sum is 0 for everything now and the other new variable i has odd numbers that my data could not sum to.
data t3;
set try1;
array _h(*) hmo_1 - hmo_60;
total_sum = 0;
do i=biop_pr to biop;
total_sum = sum(total_sum, _h(i));
end;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
The variable i can be ignored. It will always be equal to BIOP + 1.
There are a couple of primary possibilities to explain your results. First and foremost, are the variable names HMO_1, HMO_2, etc. the proper names of existing variables in your data set? (Is it possible they are spelled incorrectly and should really be H_1, H_2, etc.?) And secondly, if you do have the proper names, are they always either 0 or a missing value for the relevant range of variables (from BIOP_PR to BIOP)?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Ok...given that it works for the sample data it's a bit hard to say what's wrong with yours.
1. Check correct variable mappings in all places
2. Check array definition - type out all H_ to test if necessary
data have;
input ID $ H_Start H_End H_1 H_2 H_3 H_4 H_5 H_6 H_7 H_8 H_9 H_10 H_11 H_12;
datalines;
A 3 6 0 1 1 1 1 1 0 1 0 1 1 1
B 2 5 0 1 0 0 0 0 1 1 1 1 1 1
C 9 12 1 1 0 0 0 0 0 0 0 0 0 1
;
data want;
set have;
array _h(*) h_1 - h_12;
do i=h_start to h_end;
total_sum = sum(total_sum, _h(i));
end;
run;