Hi,
I have question about calculating cumulative mean. I want to calculate the cumulative mean of a column but want to insert classification.
time_period | ID | income |
1 | 1 | 50 |
2 | 1 | 43 |
3 | 1 | 12 |
4 | 1 | 34 |
So i used a simple code, as follows,
DATA average ;
SET data;
total +income;
average = total
run;
Take a look at the result sbelow. The first table show what i get when i run the data stpe by SASkiwin above. But i want tell SAS that it has to repeat the same thing for difference classes within the same dataset. The BY variable did not help! Any suggestions guys? I am relatively new to SAS!
time_period | ID | income | total | average |
1 | 1 | 50 | 50 | 50 |
2 | 1 | 43 | 93 | 71.5 |
3 | 1 | 12 | 105 | 82.66667 |
4 | 1 | 34 | 139 | 96.75 |
1 | 2 | 60 | 199 | 117.2 |
2 | 2 | 21 | 220 | 134.3333 |
3 | 2 | 34 | 254 | 151.4286 |
4 | 2 | 12 | 266 | 165.75 |
this is what I want (below) | ||||
time_period | ID | income | total | average |
1 | 1 | 50 | 50 | 50 |
2 | 1 | 43 | 93 | 71.5 |
3 | 1 | 12 | 105 | 82.66667 |
4 | 1 | 34 | 139 | 96.75 |
1 | 2 | 60 | 34 | 34 |
2 | 2 | 21 | 81 | 57.5 |
3 | 2 | 34 | 115 | 76.66667 |
4 | 2 | 12 | 127 | 89.25 |
Jessica
Hi,
Maybe best to close out one of the posts.
As for your point:
data average ;
set data;
by id;
retain average total n;
if first.id then do;
n=1;
total=income;
average=income;
end;
else do;
n=n+1;
total = total + income;
average = total/n;
end;
run;
Something like that, though I haven't tested it. Though as per other post, you could also SQL it:
data have;
attrib id time_period income format=best.;
infile datalines delimiter=",";
input id time_period income;
datalines;
1,1,45
1,2,32
1,3,67
1,4,34
2,1,23
2,2,89
2,3,78
2,4,10
;
run;
proc sql;
create table WANT as
select A.*,
(select SUM(INCOME) from WORK.HAVE where ID=A.ID and TIME_PERIOD <= A.TIME_PERIOD) as TOTAL,
CALCULATED TOTAL / (select COUNT(ID) from WORK.HAVE where ID=A.ID and TIME_PERIOD <= A.TIME_PERIOD) as CUMULATIVE_AVG
from HAVE A;
quit;
Hi,
Well you could use a retain to retain current average, and then reset that at the beginning of the group:
data average ;
set data;
by id;
retain average total;
if first.id then do;
total=income;
average=income;
end;
else do;
total = total + income;
average = ...;
end;
run;
Hi,
I am getting an error if i run this,. The problem is i need a 'n' that corresponds a cumulative number from each unique ID, so that i can divide it with the total income calculated fro unique ID/
data average ;
set data;
by id;
retain average total;
if first.id then do;
total=income;
average=income;
end;
else do;
total = total + income;
average = total/_n_ ;
end;
run;
Could you post an answer here, the other post i wasn't really clear with the problem. The code you gave me worked, all i need is some clarification on your last line "average = ...; " could you expand this?
Hi,
Maybe best to close out one of the posts.
As for your point:
data average ;
set data;
by id;
retain average total n;
if first.id then do;
n=1;
total=income;
average=income;
end;
else do;
n=n+1;
total = total + income;
average = total/n;
end;
run;
Something like that, though I haven't tested it. Though as per other post, you could also SQL it:
data have;
attrib id time_period income format=best.;
infile datalines delimiter=",";
input id time_period income;
datalines;
1,1,45
1,2,32
1,3,67
1,4,34
2,1,23
2,2,89
2,3,78
2,4,10
;
run;
proc sql;
create table WANT as
select A.*,
(select SUM(INCOME) from WORK.HAVE where ID=A.ID and TIME_PERIOD <= A.TIME_PERIOD) as TOTAL,
CALCULATED TOTAL / (select COUNT(ID) from WORK.HAVE where ID=A.ID and TIME_PERIOD <= A.TIME_PERIOD) as CUMULATIVE_AVG
from HAVE A;
quit;
thanks a lot! that works perfectly!
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.