@hk2013 wrote:
is it possible to do this by id and another variable
data have; input id var2 $ date:date9. num_1 Num_2; format date date9.; datalines;
1 jj06JAN2019 5 10 1 JJ 13JAN2019 5 10 1 kk 20JAN2019 5 10 1 kk 27JAN2019 5 10
2 gg 06JAN2019 4 2 2 gg 13JAN2019 4 2 2 hh 20JAN2019 4 2 2 hh 27JAN2019 4 2
;
output :
id var2 date num_1 num_2 total
1 jj 06jan2019 5 10 15
1 jj 13jan2019 5 10 25
1 kk 22jan2019 5 10 15
1 kk27jan2019 5 10 25
2 gg 06jan2019 4 2 6
2 gg 13jan2019 4 2 8
2 hh 22jan2019 4 2 6
2 hh 27jan2019 4 2 8
Please describe the rule(s) involved.
I am guessing that for the first occurrence of ID and VAR2 that total is the sum of num_1 and Num_2 but for the following values the total only accumulates Num_2 into the total. Would that be correct? If so
data have;
input id var2 $ date:date9. num_1 Num_2;
format date date9.;
datalines;
1 jj 06JAN2019 5 10
1 jj 13JAN2019 5 10
1 kk 20JAN2019 5 10
1 kk 27JAN2019 5 10
2 gg 06JAN2019 4 2
2 gg 13JAN2019 4 2
2 hh 20JAN2019 4 2
2 hh 27JAN2019 4 2
;
run;
proc sort data=have;
by id var2 date;
run;
data want;
set have;
by id var2;
retain total;
if first.var2 then total= sum(num_1,Num_2);
else total = sum(total,num_2);
run;
NOTE: Your example input data changes case for jj and JJ so the sort order would be incorrect. I added an explicit sort but if your data shouldn't change order then you could use the by notsorted in the Want data set but you could have issues depending on your data and the actual desired outcome.
Also you cannot have data on the same line as the DATALINES statement.