Hello,
I'm attempting to calculate sums across rows. I would need to sometimes calculate the sum of all rows, while other times calculate the sum of only two or three rows. How can I do this? An example of my data is below.
Thanks!
data WORK.HOURS;
input Director Engineers TeamLeads AdditionalStaff;
datalines;
60 55 20 52
89 22 74 78
78 11 69 47
77 24 59 25
;
run;
@AVUH777 wrote:
Basically, I would need to add up hours worked between rows. For example, I would need to add 60+55+20 for the first row, while for the second row, I would need to add 89 + 22.
You and I obviously understand the words "across" and "between" differently.
Nevertheless, why in the second row is it 89+22 and NOT 89+22+74+78?
The sum function will add values in a row.
sum_of_row=sum(Director,Engineers,TeamLeads,AdditionalStaff);
Could you give an example of how you would like the output to look?
@AVUH777 wrote:
Basically, I would need to add up hours worked between rows. For example, I would need to add 60+55+20 for the first row, while for the second row, I would need to add 89 + 22.
You and I obviously understand the words "across" and "between" differently.
Nevertheless, why in the second row is it 89+22 and NOT 89+22+74+78?
The sum function will add values in a row.
sum_of_row=sum(Director,Engineers,TeamLeads,AdditionalStaff);
If you have variables that contain what would describe the "groups of rows" to sum then proc summary will do this. Best practice is likely to have one or more variables that indicate which groups you want to add.
Example. Note that I added a couple of variables to identify groups. When you examine the output data set you will see combinations of the variables that appear on the CLASS statements along with a variable _type_ that identifies a specific combination. The first record, _type_=0 will have a summary of ALL the records. Then depending on the order of variables on the class statements the groups of each of the other variables taken one at a time, then two at a time .....
data WORK.HOURS; input region $ branch $ Director Engineers TeamLeads AdditionalStaff; datalines; AAA bbb 60 55 20 52 AAA bbb 89 22 74 78 AAA ccc 89 22 74 78 CCC ddd 11 69 47 CCC eee 77 24 59 25 CCC eee 7 4 5 2 ; run; proc summary data=work.hours; class region branch; var Director Engineers TeamLeads AdditionalStaff; output out=summary Sum= ; run;
You can select specific values of _type_ for conditional processing. There is also a variable _freq_ that has the number of observations for each row.
I have used this with up to 9 variables to summarize all the variables I needed an then selected which bits go into which part of a report by using the _type_ variable and a Where statement.
@AVUH777 wrote:
I would need to sometimes calculate the sum of all rows, while other times calculate the sum of only two or three rows. How can I do this? An example of my data is below.
The above in red needs a lot more explanation. Also, I assume you mean sum of a variable (down a column, not across rows), is that right?
Basically, I would need to add up hours worked between rows. For example, for the first row, I would need to add 60+55+20, while for the second row, I would need to add 89 + 22.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.