Hi all,
I am trying to sum the cost to get the variable TOTAL (the original table has only ID and COST). The output I want is shown below. Thank you in advance!
ID COST TOTAL
1 10 100
2 40 100
3 50 100
Hello @di_niu0
A very simple approach to solve your issue is to use PROC SQL. The function sum(cost) sums the cost and makes the total available in every row. Behind the scenes it makes two passes through the data, as described by @mkeintz, but the user need not worry about that.
data have;
input id cost;
datalines;
1 10
2 40
3 50
;
run;
proc sql;
create table want as
select *, sum(cost) as total
from have;
quit;
There will be a message in the log like this, indicating that the sum was calculated and merged back with the original rows.
NOTE: The query requires remerging summary statistics back with the original data
The output will be what you wanted.
If you want to produce this with a data step, you need to pass through the data set twice: the first time to establish the total, and the second time to reread and output each obs with the total established in the first pass.
So here are the first two statements to do that:
data want;
set have (in=firstpass) have (in=secondpass);
...
...
run;
The SET statement above reads ALL the observations with FIRSTPASS=1 (and SECONDPASS=0), and then rereads them with SECONDPASS=1 and FIRSTPASS=0. So you have dummy variables available to determine whether the observation in hand is from the first pass or the second pass.
So write code after the SET that creates a total during the first pass, and a filter that allows output only during the second pass.
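For anyone who wants to see it spelled out, here is one way the elided statements could be filled in. This is only a minimal sketch, not necessarily what @mkeintz had in mind. It assumes HAVE contains no variable already named TOTAL, and it uses a sum statement (total + cost), which retains TOTAL across observations, starts it at 0, and skips missing costs.

```sas
data want;
  /* Read HAVE twice: once to accumulate, once to output */
  set have (in=firstpass) have (in=secondpass);

  /* First pass: add each COST to the running total.            */
  /* The sum statement retains TOTAL and initializes it to 0.   */
  /* Assumption: HAVE has no variable named TOTAL of its own.   */
  if firstpass then total + cost;

  /* Second pass: TOTAL is complete, so write the observation */
  if secondpass then output;
run;
```

With the sample data from this thread, WANT should end up with the three original observations, each carrying TOTAL=100. The IN= variables FIRSTPASS and SECONDPASS are temporary, so they are not written to WANT.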
Do you really have to sum the costs of different ids? This seems strange.
Please use data step if possible.
For anyone else who doesn't have this unfortunate restriction, use the SQL solution from @Sajid01.
For @di_niu0, who seems to require a data step: why??? In SAS, there are many ways to accomplish something, and sometimes you can get a better solution by removing this restriction.