I'm wondering what's the best way to create a variable that contains a column's total sum inside of a data step. Can this be easily achieved without proc sql?
As an example, I'd like to create a new variable, total_age, that contains the sum of all the students' ages, in the sashelp.class dataset. My code below creates a cumulative sum, but how would I revise to create a total so that each row displays the total age? Thank you.
data sas_intro;
set sashelp.class;
retain;
total_age = sum(total_age, age); /*creates cumulative sum, not total)*/
run;
A data step approach
data want;
if _N_ = 1 then do;
do until (z);
set sashelp.class end = z;
total_age = sum(total_age, age);
retain total_age;
end;
end;
set sashelp.class;
run;
It's either two steps in a data step (calculation + merge) or a DoW loop. Programmatically, SQL is much easier here for sure.
Depending on what you're doing, for example creating a report, the reporting procedure may be better suited to doing that as well.
data sas_intro;
do _n_ =1 by 1 until(eof);
set sashelp.class end=eof;
total_age = sum(total_age, age); /*creates cumulative sum, not total)*/
end;
do _n_ =1 by 1 until(eof2);
set sashelp.class end=eof2;
output;
end;
run;
Other methods illustrated here:
https://github.com/statgeek/SAS-Tutorials/blob/master/add_average_value_to_dataset.sas
An SQL approach
proc sql;
create table want as
select *
, sum(age) as total_age
from sashelp.class
;
quit;
A data step approach
data want;
if _N_ = 1 then do;
do until (z);
set sashelp.class end = z;
total_age = sum(total_age, age);
retain total_age;
end;
end;
set sashelp.class;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.