Calculate average result excluding current

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 123
Accepted Solution

Calculate average result excluding current

Hi all,

 

I'm trying to do an analysis of student results. For this I'm measuring performance of a student in a particular module vs average performance in all other modules. Can anyone advise how I could go about calculating this?

 

My data is very straightforward, 3 columns:

Stud_id

module_id

result

 

Thank you

 


Accepted Solutions
Solution
‎09-29-2016 03:33 PM
Super User
Posts: 5,498

Re: Calculate average result excluding current

Posted in reply to itchyeyeballs

OK, with one observation per student per module, the programming isn't too difficult:

 

proc sort data=have;

   by stud_id;

run;

 

proc summary data=have;

   by stud_id;

   var result;

   output out=summary_stats (keep=n_modules total_result) sum=total_result n=n_modules;

run;

 

data want;

   merge have summary_stats;

   by stud_id;

   other_modules_mean = (total_result - result) / (n_modules - 1);

run;

View solution in original post


All Replies
Super User
Posts: 5,498

Re: Calculate average result excluding current

Posted in reply to itchyeyeballs

Is each student limited to a single observation per module?

 

If not, you might have to provide an example of the results you are looking for (just for one student, but with different numbers of observations per module).

Frequent Contributor
Posts: 123

Re: Calculate average result excluding current

Posted in reply to Astounding

Hi,

 

Yes, each student has a single result per module.

 

Thank you

Solution
‎09-29-2016 03:33 PM
Super User
Posts: 5,498

Re: Calculate average result excluding current

Posted in reply to itchyeyeballs

OK, with one observation per student per module, the programming isn't too difficult:

 

proc sort data=have;

   by stud_id;

run;

 

proc summary data=have;

   by stud_id;

   var result;

   output out=summary_stats (keep=n_modules total_result) sum=total_result n=n_modules;

run;

 

data want;

   merge have summary_stats;

   by stud_id;

   other_modules_mean = (total_result - result) / (n_modules - 1);

run;

Frequent Contributor
Posts: 123

Re: Calculate average result excluding current

Posted in reply to Astounding

Thank you, very much appreciated!

 

Can I push my luck and ask you to annotate what the code is doing? I'm coming from an MS world so this is very different to me

 

Super User
Posts: 5,498

Re: Calculate average result excluding current

Posted in reply to itchyeyeballs

I can give you the big picture ... but there's no substitute for learning SAS.

 

In SAS programs, each step runs in order, and independently of the other steps.  So PROC SORT completes, then PROC SUMMARY begins.  PROC SUMMARY completes, then the DATA step runs.

 

PROC SORT puts the observations in order.  That is necessary to permit processing BY STUD_ID in later steps.

 

PROC SUMMARY creates a summary data set named SUMMARY_STATS.  It contains one observation per STUD_ID, with two variables.  TOTAL_RESULT is the sum of all the RESULT values for that STUD_ID, and N_MODULES is the number of observations for that STUD_ID.

 

The DATA step adds those two variables to every original observation, matching by STUD_ID.  That match gives you all the information needed for calculations.

Frequent Contributor
Posts: 123

Re: Calculate average result excluding current

Posted in reply to Astounding

Thank you so much,

 

embarrassingly it was the basic logic that was eluding me, I was thinking about how to dynamically exclude one row, it didnt ocur to me to just add everything up then subbract, need to remeber to take a step back sometimes!

 

 

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 460 views
  • 2 likes
  • 2 in conversation