BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
boyandhisrobot
Calcite | Level 5

Hi, All,

 

I've noticed a difference between these two syntaxes and I was wondering if somebody could explain the difference to me.

If I run  the following:

 

proc sql;
create table output
as select id,
sub_id,
sum(col_x+col_y) as col_z
from input
group by id
;
quit;

 

col_z will be summed across the rows AND for the grouped variable.  However, if I run this proc sql with sum(col_x,col_y) then the sum will be calculated across the rows but NOT for the grouped variable.  So why is there a difference between these two?

 

Thanks in advance for any help you can offer.

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

You are using two different functions. The normal SAS function SUM(,) takes two or more arguments and will add them together (ignoring missing values).  There many other similar functions that SAS has for this.

 

The SQL aggregate function SUM() will aggregate, in this case sum,  its single argument across all observations (or all observations within a group if your statement includes a GROUP BY clause).

 

Looks like you want to use BOTH of them.

 

select sum( sum(x,y) ) as Total_X_and_Y
from have
;

View solution in original post

5 REPLIES 5
Tom
Super User Tom
Super User

You are using two different functions. The normal SAS function SUM(,) takes two or more arguments and will add them together (ignoring missing values).  There many other similar functions that SAS has for this.

 

The SQL aggregate function SUM() will aggregate, in this case sum,  its single argument across all observations (or all observations within a group if your statement includes a GROUP BY clause).

 

Looks like you want to use BOTH of them.

 

select sum( sum(x,y) ) as Total_X_and_Y
from have
;
boyandhisrobot
Calcite | Level 5

Thanks, Tom.  That clears it up.

vishalrajpoot3
Obsidian | Level 7

 

This is how sum function works-

 

Sum (1, 2, ., 3) = 6,    ==> means when we use ' , ' in sum function it ignores the missing value and sum the other values.

Sum (1 + 2 + . +3) = . (missing),    ==> means when we use '+' in sum function it considers 'missing + Value = missing' .

 

 

Tom
Super User Tom
Super User

@vishalrajpoot3 wrote:

 

This is how sum function works-

 

Sum (1, 2, ., 3) = 6,    ==> means when we use ' , ' in sum function it ignores the missing value and sum the other values.

Sum (1 + 2 + . +3) = . (missing),    ==> means when we use '+' in sum function it considers 'missing + Value = missing' .

 

 


You are talking about the difference between the SUM(,) function and just simple addition. 

sum(x,y,z)
vs
x+y+z

If you use your second expression inside of PROC SQL then it will use the SQL aggregate function SUM() instead of the SAS function SUM(,) that the first expression is using.  If you want to force it to use the SAS function then add a second argument.

sum(x+y+z,.)
vishalrajpoot3
Obsidian | Level 7
yes Tom, you are right, it shows only difference. I am at bragging stage of learning SAS, so I just tried to give a part of solution.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 4405 views
  • 0 likes
  • 3 in conversation