- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi, All,
I've noticed a difference between these two syntaxes and I was wondering if somebody could explain the difference to me.
If I run the following:
proc sql;
create table output
as select id,
sub_id,
sum(col_x+col_y) as col_z
from input
group by id
;
quit;
col_z will be summed across the rows AND for the grouped variable. However, if I run this proc sql with sum(col_x,col_y) then the sum will be calculated across the rows but NOT for the grouped variable. So why is there a difference between these two?
Thanks in advance for any help you can offer.
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You are using two different functions. The normal SAS function SUM(,) takes two or more arguments and will add them together (ignoring missing values). There many other similar functions that SAS has for this.
The SQL aggregate function SUM() will aggregate, in this case sum, its single argument across all observations (or all observations within a group if your statement includes a GROUP BY clause).
Looks like you want to use BOTH of them.
select sum( sum(x,y) ) as Total_X_and_Y
from have
;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You are using two different functions. The normal SAS function SUM(,) takes two or more arguments and will add them together (ignoring missing values). There many other similar functions that SAS has for this.
The SQL aggregate function SUM() will aggregate, in this case sum, its single argument across all observations (or all observations within a group if your statement includes a GROUP BY clause).
Looks like you want to use BOTH of them.
select sum( sum(x,y) ) as Total_X_and_Y
from have
;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thanks, Tom. That clears it up.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
This is how sum function works-
Sum (1, 2, ., 3) = 6, ==> means when we use ' , ' in sum function it ignores the missing value and sum the other values.
Sum (1 + 2 + . +3) = . (missing), ==> means when we use '+' in sum function it considers 'missing + Value = missing' .
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@vishalrajpoot3 wrote:
This is how sum function works-
Sum (1, 2, ., 3) = 6, ==> means when we use ' , ' in sum function it ignores the missing value and sum the other values.
Sum (1 + 2 + . +3) = . (missing), ==> means when we use '+' in sum function it considers 'missing + Value = missing' .
You are talking about the difference between the SUM(,) function and just simple addition.
sum(x,y,z) vs x+y+z
If you use your second expression inside of PROC SQL then it will use the SQL aggregate function SUM() instead of the SAS function SUM(,) that the first expression is using. If you want to force it to use the SAS function then add a second argument.
sum(x+y+z,.)
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content