DATA Step, Macro, Functions and more

Difference between sum(x+y) and sum(x,y)

Accepted Solution Solved
Reply
New Contributor
Posts: 2
Accepted Solution

Difference between sum(x+y) and sum(x,y)

Hi, All,

 

I've noticed a difference between these two syntaxes and I was wondering if somebody could explain the difference to me.

If I run  the following:

 

proc sql;
create table output
as select id,
sub_id,
sum(col_x+col_y) as col_z
from input
group by id
;
quit;

 

col_z will be summed across the rows AND for the grouped variable.  However, if I run this proc sql with sum(col_x,col_y) then the sum will be calculated across the rows but NOT for the grouped variable.  So why is there a difference between these two?

 

Thanks in advance for any help you can offer.

 

 


Accepted Solutions
Solution
‎07-19-2018 09:57 AM
Super User
Super User
Posts: 8,289

Re: Difference between sum(x+y) and sum(x,y)

Posted in reply to boyandhisrobot

You are using two different functions. The normal SAS function SUM(,) takes two or more arguments and will add them together (ignoring missing values).  There many other similar functions that SAS has for this.

 

The SQL aggregate function SUM() will aggregate, in this case sum,  its single argument across all observations (or all observations within a group if your statement includes a GROUP BY clause).

 

Looks like you want to use BOTH of them.

 

select sum( sum(x,y) ) as Total_X_and_Y
from have
;

View solution in original post


All Replies
Solution
‎07-19-2018 09:57 AM
Super User
Super User
Posts: 8,289

Re: Difference between sum(x+y) and sum(x,y)

Posted in reply to boyandhisrobot

You are using two different functions. The normal SAS function SUM(,) takes two or more arguments and will add them together (ignoring missing values).  There many other similar functions that SAS has for this.

 

The SQL aggregate function SUM() will aggregate, in this case sum,  its single argument across all observations (or all observations within a group if your statement includes a GROUP BY clause).

 

Looks like you want to use BOTH of them.

 

select sum( sum(x,y) ) as Total_X_and_Y
from have
;
New Contributor
Posts: 2

Re: Difference between sum(x+y) and sum(x,y)

Thanks, Tom.  That clears it up.

Occasional Contributor
Posts: 13

Re: Difference between sum(x+y) and sum(x,y)

Posted in reply to boyandhisrobot

 

This is how sum function works-

 

Sum (1, 2, ., 3) = 6,    ==> means when we use ' , ' in sum function it ignores the missing value and sum the other values.

Sum (1 + 2 + . +3) = . (missing),    ==> means when we use '+' in sum function it considers 'missing + Value = missing' .

 

 

Super User
Super User
Posts: 8,289

Re: Difference between sum(x+y) and sum(x,y)

[ Edited ]
Posted in reply to vishalrajpoot3

@vishalrajpoot3 wrote:

 

This is how sum function works-

 

Sum (1, 2, ., 3) = 6,    ==> means when we use ' , ' in sum function it ignores the missing value and sum the other values.

Sum (1 + 2 + . +3) = . (missing),    ==> means when we use '+' in sum function it considers 'missing + Value = missing' .

 

 


You are talking about the difference between the SUM(,) function and just simple addition. 

sum(x,y,z)
vs
x+y+z

If you use your second expression inside of PROC SQL then it will use the SQL aggregate function SUM() instead of the SAS function SUM(,) that the first expression is using.  If you want to force it to use the SAS function then add a second argument.

sum(x+y+z,.)
Occasional Contributor
Posts: 13

Re: Difference between sum(x+y) and sum(x,y)

yes Tom, you are right, it shows only difference. I am at bragging stage of learning SAS, so I just tried to give a part of solution.
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 71 views
  • 0 likes
  • 3 in conversation