BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Jessica98
Calcite | Level 5

Hi,

I have question about calculating cumulative mean. I want to calculate the cumulative mean of a column but want to insert classification.

time_periodIDincome
1150
2143
3112
4134

So i used a simple code, as follows,

DATA average ;

SET data;

  total +income;

  average = total

run;

Take a look at the result sbelow. The first table show what i get when i run the data stpe by SASkiwin above. But i want tell SAS that it has to repeat the same thing for difference classes within the same dataset. The BY variable did not help! Any suggestions guys? I am relatively new to SAS!

time_periodIDincometotalaverage
11505050
21439371.5
311210582.66667
413413996.75
1260199117.2
2221220134.3333
3234254151.4286
4212266165.75
this is what I want (below)
time_periodIDincometotalaverage
11505050
21439371.5
311210582.66667
413413996.75
12603434
22218157.5
323411576.66667
421212789.25

Jessica

1 ACCEPTED SOLUTION

Accepted Solutions
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Hi,

Maybe best to close out one of the posts.

As for your point:

data average ;

     set data;

     by id;

     retain  average total n;

     if first.id then do;

          n=1;

          total=income;

          average=income;

     end;

     else do;

          n=n+1;

          total = total + income;

          average = total/n;

     end;

run;

Something like that, though I haven't tested it.  Though as per other post, you could also SQL it:

data have;
  attrib id time_period income format=best.;
  infile datalines delimiter=",";
  input id time_period income;
datalines;
1,1,45
1,2,32
1,3,67
1,4,34
2,1,23
2,2,89
2,3,78
2,4,10
;
run;

proc sql;
  create table WANT as
  select  A.*,
          (select SUM(INCOME) from WORK.HAVE where ID=A.ID and TIME_PERIOD <= A.TIME_PERIOD) as TOTAL,
          CALCULATED TOTAL / (select COUNT(ID) from WORK.HAVE where ID=A.ID and TIME_PERIOD <= A.TIME_PERIOD) as CUMULATIVE_AVG
  from    HAVE A;
quit;

View solution in original post

5 REPLIES 5
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Hi,

Well you could use a retain to retain current average, and then reset that at the beginning of the group:

data average ;

     set data;

     by id;

     retain  average total;

     if first.id then do;

          total=income;

          average=income;

     end;

     else do;

          total = total + income;

          average = ...;

     end;

run;

Jessica98
Calcite | Level 5

Hi,

I am getting an error if i run this,. The problem is i need a 'n' that corresponds a cumulative number from each unique ID, so that i can divide it with the total income calculated fro unique ID/

data average ;

     set data;

     by id;

     retain  average total;

     if first.id then do;

          total=income;

          average=income;

     end;

     else do;

          total = total + income;

          average = total/_n_ ;

     end;

run;

Jessica98
Calcite | Level 5

Could you post an answer here, the other post i wasn't really clear with the problem. The code you gave me worked, all i need is some clarification on your last line "average = ...; " could you expand this?

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Hi,

Maybe best to close out one of the posts.

As for your point:

data average ;

     set data;

     by id;

     retain  average total n;

     if first.id then do;

          n=1;

          total=income;

          average=income;

     end;

     else do;

          n=n+1;

          total = total + income;

          average = total/n;

     end;

run;

Something like that, though I haven't tested it.  Though as per other post, you could also SQL it:

data have;
  attrib id time_period income format=best.;
  infile datalines delimiter=",";
  input id time_period income;
datalines;
1,1,45
1,2,32
1,3,67
1,4,34
2,1,23
2,2,89
2,3,78
2,4,10
;
run;

proc sql;
  create table WANT as
  select  A.*,
          (select SUM(INCOME) from WORK.HAVE where ID=A.ID and TIME_PERIOD <= A.TIME_PERIOD) as TOTAL,
          CALCULATED TOTAL / (select COUNT(ID) from WORK.HAVE where ID=A.ID and TIME_PERIOD <= A.TIME_PERIOD) as CUMULATIVE_AVG
  from    HAVE A;
quit;

Jessica98
Calcite | Level 5

thanks a lot! that works perfectly! Smiley Happy

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 2374 views
  • 3 likes
  • 2 in conversation