## Summing vars based on values of another variable

Frequent Contributor
Posts: 138

# Summing vars based on values of another variable

Hi,

I have a dataset with 5 variables--ID, a group variable that can be anywhere from -50 to +50, and three count variables. The dataset is at the person/group level (so each ID could occur 100 times if a person has a count for every group):

ID     group_var     count_var1     count_var2     count_var3

1            -50                0                     1                    3

1            -45                2                     1                    5

1            -1                  2                     0                    1

1             4                  4                     3                    2

1            12                 1                     0                    0

2           -25                 1                     2                    3

2            -7                  0                     0                    1

2            -2                  3                     4                    2

What I want is to obtain summative counts using values of group. So for each ID, I would want to know the sum of count_var1, count_var2, and count_var3, for group_var=-50 to -26, -25 to -1, +1 to +25, and +26 to +50. The output would ideally look like this:

ID     sum_count_var1_gp_neg_50_to_neg_26 sum_count_var2_gp_neg_50_to_neg_26     sum_count_var3_gp_neg_50_to_neg_26

1                                 2                                                              2                                                                 8

and continuing across with the sums for the other group ranges.

Any help is much appreciated!

Super User
Posts: 9,599

## Re: Summing vars based on values of another variable

Well, I would first assign the groupings in a new variable.  Then its just a matter of aggregation of that data.  In your logic you will miss out on 0 for group I would point out.  Also, do you really need the transposed output?   Its generally better to work with normalised data, and only transpose for outputs.

data have;

input ID     group_var     count_var1     count_var2     count_var3;

datalines;

1            -50                0                     1                    3

1            -45                2                     1                    5

1            -1                  2                     0                    1

1             4                  4                     3                    2

1            12                 1                     0                    0

2           -25                 1                     2                    3

2            -7                  0                     0                    1

2            -2                  3                     4                    2

;

run;

data inter;

set have;

length group \$20;

if -50 <= group_var <= -26 then group="-50 to -26";

if -25 <= group_var <= -1 then group="-25 to -1";

if 1 <= group_var <= 25 then group="1 to 25";

if 26 <= group_var <= 50 then group="26 to 50";

run;

proc sql;

create table WANT as

select  ID,

GROUP,

sum(COUNT_VAR1) as COUNT_1_SUM,

sum(COUNT_VAR2) as COUNT_2_SUM,

sum(COUNT_VAR3) as COUNT_3_SUM

from    INTER

group by ID,

GROUP;

quit;

Super User
Posts: 10,784

## Re: Summing vars based on values of another variable

### Code: Program

`data have;  input ID   group_var   count_var1   count_var2   count_var3;datalines;1 -50 0 1 31 -45 2 1 51 -1 2 0 11 4 4 3 21 12 1 0 0  2 -25 1 2 32 -7 0 0 12 -2 3 4 2;run;proc sql;create table want as select id,   (select sum(count_var1) from have where id=a.id and group_var between -50 and -26) as sum_count_var1_neg_50_to_neg_26,   (select sum(count_var2) from have where id=a.id and group_var between -50 and -26) as sum_count_var2_neg_50_to_neg_26,   (select sum(count_var3) from have where id=a.id and group_var between -50 and -26) as sum_count_var3_neg_50_to_neg_26  from (select distinct id from have) as a ;quit;`
Frequent Contributor
Posts: 138

## Re: Summing vars based on values of another variable

The proc sql solution yielded the correct output, but would you mind explaining to me exactly what the code is doing so I can understand how it works?

Thanks!

Super User
Posts: 13,583

## Re: Summing vars based on values of another variable

proc format;

value mygroup

-50 - -26 = "-50 to -26"

-25 - -1 ="-25 to -1"

1 - 25 ="+1 to +25"

26 - 50 = "+26 to +50"

other = "Out of Range"

;

run;

proc summary data=have nway;

class ID group;

format group mygroup. ;

var var1 var2 var3;

output out=sumset (drop = _type_ _freq_) sum=;

run;

Transpose the result if you need to.

Valued Guide
Posts: 765

## Re: Summing vars based on values of another variable

Hi, another idea ...

data want (keep = id tot;

array tot(12);

do j=1 to 12; tot(j)=0; end;

do until(last.id);

set have;

by id;

k = 3 * (ceil(group_var/25) + (group_var eq -50) + 1);

tot(k+1) + count_var1;

tot(k+2) + count_var2;

tot(k+3) + count_var3;

end;

run;

ID    tot1    tot2    tot3    tot4    tot5    tot6    tot7    tot8    tot9    tot10    tot11    tot12

1      2       2       8       2       0       1       5       3       2       0        0        0

2      1       2       3       3       4       3       0       0       0       0        0        0

Discussion stats
• 5 replies
• 291 views
• 0 likes
• 5 in conversation