DATA Step, Macro, Functions and more

Do loop sum variables

Accepted Solution Solved
Reply
Contributor
Posts: 20
Accepted Solution

Do loop sum variables

[ Edited ]

Hello,

I bet this is easy but I've been failing for a while now so maybe someone can help me. Below is sample data.

 

ID H-Start H-End H_1 H_2 H_3 H_4 H_5 H_6 H_7 H_8 H_9 H_10 H_11 H_12
A 3 6 0 1 1 1 1 1 0 1 0 1 1 1
B 2 5 0 1 0 0 0 0 1 1 1 1 1 1
C 9 12 1 1 0 0 0 0 0 0 0 0 0

1

 

I need to sum the H_# for certain ranges.  The variable H-Start has the start H value and H-End has the end.  For ID  'A' I want to sum H_3 through H_6. I've been trying to write a do loop and macros to solve this but have been failing.  

I would like the output to look like this: 

ID H-Start H-End H_1 H_2 H_3 H_4 H_5 H_6 H_7 H_8 H_9 H_10 H_11 H_12 Want_Sum
1 3 6 0 1 1 1 1 1 0 1 0 1 1 1 4
2 2 5 0 1 0 0 0 0 1 1 1 1 1 1 1
3 9 12 1 1 0 0 0 0 0 0 0 0 0 1 1

Thank you!


Accepted Solutions
Solution
‎09-12-2017 03:14 PM
Super User
Posts: 23,357

Re: Do loop sum variables

data want;
set have;

array _h(*) h_1 - h_12;

total_sum = 0;

do i=h_start to H_end;
total_sum = sum(total_sum, _h(i));
end;

run;

View solution in original post


All Replies
PROC Star
Posts: 1,605

Re: Do loop sum variables

data have;

input ID $ H_Start    H_End H_1  H_2  H_3  H_4  H_5  H_6  H_7  H_8  H_9  H_10     H_11 H_12;

datalines;

A    3    6    0    1    1    1    1    1    0    1    0    1    1    1

B    2    5    0    1    0    0    0    0    1    1    1    1    1    1

C    9    12   1    1    0    0    0    0    0    0    0    0    0    1

;

 

data want;

set have;

if id='A' then sum=sum(of h_3--h_6);

run;

Contributor
Posts: 20

Re: Do loop sum variables

Posted in reply to novinosrin
Hi,
Sorry, maybe I wasn't clear. I want to add a variable to sum the specific H_# for each person. My real data has about 30K ID's and 60 H values.
Thanks.
PROC Star
Posts: 1,605

Re: Do loop sum variables

Please provide a sample of your output table. Thank you

Contributor
Posts: 20

Re: Do loop sum variables

[ Edited ]
Posted in reply to novinosrin

I added my wanted output to the original post.

Super User
Posts: 6,644

Re: Do loop sum variables

With 60 H variables:

 

data want;

set have;

array h_ {60};

total_h = 0;

if (1 <= h_start <= 60) and (1 <= h_end <= 60) then do _n_=h_start to h_end;

   total_h + h_{_n_};

end;

run;

Contributor
Posts: 20

Re: Do loop sum variables

Posted in reply to Astounding
I'm getting the error message (biop_pr is my var name for start number and biop is my variable for the stop value).
NOTE: Character values have been converted to numeric values at the places given by:
(Line)Smiley SadColumn).
613:14
NOTE: Variable hmo_biop_pr is uninitialized.
NOTE: Variable hmo_biop is uninitialized.
Super User
Posts: 6,644

Re: Do loop sum variables

This error message says you got it wrong.  Your data set does not contain variables named HMO_BIOP_PR or HMO_BIOP.  Maybe you spelled the variable names wrong.  Or maybe you chose the wrong data set.  But those variables don't exist in the data set that you are using.

 

If possible, it would be more helpful if you were to post the log so we can see any errors in context.

Contributor
Posts: 20

Re: Do loop sum variables

Posted in reply to Astounding
Sorry I was using my ready data H_# is hmo_# and biop_pr is for the H-start and biop is for the H-stop.
Solution
‎09-12-2017 03:14 PM
Super User
Posts: 23,357

Re: Do loop sum variables

data want;
set have;

array _h(*) h_1 - h_12;

total_sum = 0;

do i=h_start to H_end;
total_sum = sum(total_sum, _h(i));
end;

run;
Contributor
Posts: 20

Re: Do loop sum variables

[ Edited ]

I'm confused. For every ID there is a different start and stop H variables. For "do i=h_start to H_end;" What do I put in the '_start' and the '_end'?

 

Edit again to add the post I was referring to. 

data want;
set have;

array _h(*) h_1 - h_12;

total_sum = 0;

do i=h_start to H_end;
total_sum = sum(total_sum, _h(i));
end;

run;

 

Edited to add:

For A I need it to sum H_3 throught H_6, for B I need it to add H_2 though H_5, and C would be H_9 through H_12.  

The variable H-Start is the number of H_# I want it to begin summing and the variable H-end has the H I want it to stop at.

Super User
Posts: 23,357

Re: Do loop sum variables


weweaw wrote:
I'm confused. For every ID there is a different start and stop H variables. 

 

That's why you use the variable names.

 


weweaw wrote:
. For "do i=h_start to H_end;" What do I put in the '_start' and the '_end'?

There are several solutions so we don' tknow which you're referring to. 

It's best if you post your full code and log if you're having issues. 

Contributor
Posts: 20

Re: Do loop sum variables

Ok, I see where I changed it wrong. No error but total_sum is 0 for everything now and the other new variable i has odd numbers that my data could not sum to. 

data t3;
set try1;

array _h(*) hmo_1 - hmo_60;

total_sum = 0;

do i=biop_pr to biop;
total_sum = sum(total_sum, _h(i));
end;

run;

Super User
Posts: 6,644

Re: Do loop sum variables

The variable i can be ignored.  It will always be equal to BIOP + 1.

 

There are a couple of primary possibilities to explain your results.  First and foremost, are the variable names HMO_1, HMO_2, etc. the proper names of existing variables in your data set?  (Is it possible they are spelled incorrectly and should really be H_1, H_2, etc.?)  And secondly, if you do have the proper names, are they always either 0 or a missing value for the relevant range of variables (from BIOP_PR to BIOP)?

Super User
Posts: 23,357

Re: Do loop sum variables

Ok...given that it works for the sample data it's a bit hard to say what's wrong with yours. 

 

1. Check correct variable mappings in all places

2. Check array definition - type out all H_ to test if necessary

 

data have;
input ID $ H_Start    H_End H_1  H_2  H_3  H_4  H_5  H_6  H_7  H_8  H_9  H_10     H_11 H_12;
datalines;
A    3    6    0    1    1    1    1    1    0    1    0    1    1    1
B    2    5    0    1    0    0    0    0    1    1    1    1    1    1
C    9    12   1    1    0    0    0    0    0    0    0    0    0    1
;

data want;
set have;

array _h(*) h_1 - h_12;

do i=h_start to h_end;
 total_sum = sum(total_sum, _h(i));
end;

run;
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 14 replies
  • 630 views
  • 4 likes
  • 4 in conversation