Summing across rows according to desired condition

Contributor
Posts: 35

Dear All,

I have the following dataset:

data have;
     input ID1 ID2 V1 V2 V3;
datalines;
101     10001     2.1     1.7     3.1
101     10001     3.0     1.7     3.1
101     10002     5.9     2.3     8.5
102     10003     1.9     4.4     2.8
102     10003     1.0     4.4     2.8
102     10003     2.1     4.4     2.8
103     10004     3.7     5.0     7.4
;

I would like to sum the values of the variable V1 across rows only when the records have the same ID1 and ID2. That is, I would like to obtain the following:


data want;
     input ID1 ID2 V1 V2 V3;
datalines;
101     10001     5.1     1.7     3.1
101     10002     5.9     2.3     8.5
102     10003     5.0     4.4     2.8
103     10004     3.7     5.0     7.4
;

Notice that the values of V1 in want are given by: 5.1 = 2.1+3.0 and 5.0=1.9+1.0+2.1.


Any help would be highly appreciated.


Accepted Solutions
Solution
‎10-01-2014 10:23 AM
Super User
Posts: 7,720

Re: Summing across rows according to desired condition

Hi,

Actually you could do:

proc sql;
     create table WANT as
     select     distinct
                    ID1,
                    ID2,
                    sum(V1) as V1,
                    sum(V2) as V2,
                    sum(V3) as V3
     from        HAVE
     group by ID1, ID2;     /* group on both keys; || needs character operands in PROC SQL */
quit;

Another way is a data step with retain. I have done this quickly: you could use by id1 id2 directly, but I combined the two into one key because I don't really have time to think about it right now. You could also use arrays for the sums.

data inter;
     set have;
     length tot $200;
     tot=catx('_',id1,id2);     /* assumed: tot is the combined id1/id2 key */
run;

data want; /* Assumed sorted by id1 and id2 */
     set inter;
     by tot;
     retain sum1-sum3;
     if first.tot then do;
          sum1=v1; sum2=v2; sum3=v3;
     end;
     else do;
          sum1=sum1+v1;
          sum2=sum2+v2;
          sum3=sum3+v3;
     end;
     if last.tot then output;
run;


All Replies

Contributor
Posts: 35

Re: Summing across rows according to desired condition

Thank you @RW9 for your help. I think there are some typos in your first version with PROC SQL. I'm posting a corrected version of your code:

proc sql;
     create table WANT as
     select     distinct ID1, ID2, sum(V1) as V1, V2, V3
     from        HAVE
     group by ID1, ID2;
quit;
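
The query above selects V2 and V3 without aggregating or grouping them, so PROC SQL remerges the summary statistics with the original rows (it says so in a log note) and DISTINCT then collapses the duplicates. A minimal sketch of an alternative that avoids the remerge is to group by the carried-along columns as well. It assumes, as in the sample data, that V2 and V3 are constant within each ID1/ID2 pair; if they are not, it returns one row per distinct V2/V3 combination.

```sas
proc sql;
     create table WANT as
     select     ID1, ID2, sum(V1) as V1, V2, V3
     from        HAVE
     /* assumption: V2 and V3 do not vary within an ID1/ID2 group */
     group by ID1, ID2, V2, V3;
quit;
```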

Super User
Posts: 7,720

Re: Summing across rows according to desired condition

Nope, no typos there.  You want three variables out: V1 = sum of V1, V2 = sum of V2, V3 = sum of V3.

You do this by:

sum(variable) as new_variable.

So each one needs specifying, per my original SQL.

Contributor
Posts: 35

Re: Summing across rows according to desired condition

I want 3 variables out, but I want to sum only with respect to variable V1 across rows. Your code is also summing V2 and V3 across rows.
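
For the V1-only requirement, the "by id1 id2" data step idea mentioned earlier in the thread could be sketched as follows. This is an illustration rather than code posted by any participant: the name v1_sum is hypothetical, HAVE is assumed to be sorted by ID1 and ID2, and V2 and V3 are simply taken from the last row of each group.

```sas
data want (drop=v1 rename=(v1_sum=v1));
     set have;
     by id1 id2;
     if first.id2 then v1_sum=0;   /* reset at the start of each ID1/ID2 group */
     v1_sum+v1;                    /* sum statement: v1_sum is retained, missing v1 is ignored */
     if last.id2 then output;      /* one row per group, V2 and V3 from the last row */
run;
```

Because first.id2 and last.id2 also change whenever id1 changes, they mark the boundaries of each ID1/ID2 pair, so no combined key is needed.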

Super User
Posts: 9,874

Re: Summing across rows according to desired condition

Mark,

Don't piss at RW9 :), because your question is really not easy.

data have;
     input ID1 ID2 V1 V2 V3;
datalines;
101     10001     2.1     1.7     3.1
101     10001     3.0     1.7     3.1
101     10002     5.9     2.3     8.5
102     10003     1.9     4.4     2.8
102     10003     1.0     4.4     2.8
102     10003     2.1     4.4     2.8
103     10004     3.7     5.0     7.4
;
run;
proc sql;
     create table want as
     select id1, id2,
            case when range(v1)=0 then avg(v1) else sum(v1) end as v1,
            case when range(v2)=0 then avg(v2) else sum(v2) end as v2,
            case when range(v3)=0 then avg(v3) else sum(v3) end as v3
     from have
     group by id1, id2;
quit;

Xia Keshan
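
One thing to keep in mind with the range() test above: it treats "all values in the group are equal" as "do not sum", so a group whose V1 values happen to be identical on every row gets the single value rather than the total. The sketch below uses a made-up dataset (tie, not part of the original question) to compare the two expressions side by side; with two rows of 2.0, the CASE expression yields 2 while a plain sum yields 4.

```sas
/* Hypothetical data: both rows of the group carry the same V1 */
data tie;
     input ID1 ID2 V1;
datalines;
201     20001     2.0
201     20001     2.0
;
run;

proc sql;
     select id1, id2,
            case when range(v1)=0 then avg(v1) else sum(v1) end as v1_case,
            sum(v1) as v1_sum
     from tie
     group by id1, id2;
quit;
```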

Contributor
Posts: 35

Re: Summing across rows according to desired condition

I'm not pissing at @RW9. I immediately thanked him for his help and selected his answer as the correct one. I just posted a different version of the code for future users. :D
