DATA Step, Macro, Functions and more

Calculating average

Accepted Solution Solved
Reply
Regular Contributor
Posts: 215
Accepted Solution

Calculating average

Hi All,

I have student's grade. I want want average of student's grade, but

when student ID=102, I want the average of top 3 courses average and 

when student ID=103, I want the average of top 4 courses. 

 

Can someone help please. Thanks,

 

data have;
input ID Course $ Score;
cards;
101 Eng 20
101 Bio 20
101 Math 18

105 Eng 15

105 Math 15

105 Phy 15
102 Eng 15
102 Bio 20
102 Phy 16
102 Arts 12
103 Bio 18
103 Math 20
103 Eng 16
103 Che 15
103 Phy 17
RUN;

 

Output will be like this:

 

ID Course Score Average
101 Eng 20 19.3
101 Bio 20 19.3
101 Math 18 19.3

105 Eng 15 15

105 Math 15 15

105 Phy 15 15

102 Eng 15 17
102 Bio 20 17
102 Phy 16 17
102 Arts 12 17
103 Bio 18 18.25
103 Math 20 18.25
103 Eng 16 18.25
103 Che 15 18.25
103 Phy 19 18.25


Accepted Solutions
Solution
‎04-29-2016 01:47 PM
Super User
Posts: 9,671

Re: Calculating average

data have;
input ID Course $ Score;
cards; 
101 Eng 20
101 Bio 20
101 Math 18
105 Eng 15
105 Math 15
105 Phy 15
102 Eng 15
102 Bio 20
102 Phy 16 
102 Arts 12
103 Bio 18
103 Math 20
103 Eng 16
103 Che 15
103 Phy 17
;
RUN;
data want;
n=0;sum=0;
 do until(last.id);
  set have;
  by id notsorted;
  if not missing(Score) then n+1; 
  if id =102 then do;
   if n lt 4 then sum+Score;
  end;
  else if id =103 then do;
   if n lt 5 then sum+Score;
  end;
  else do;
   sum+Score;
  end;
 end;
 
 if id =102 then average=sum/3; 
  else if id =103 then average=sum/4; 
   else average=sum/n; 
   
 do until(last.id);
  set have;
  by id notsorted;
  output;
 end;
drop n sum;
run;
  

View solution in original post


All Replies
Super User
Posts: 17,758

Re: Calculating average

How are those rules defined? Based on data or outside knowledge? What about other IDs?

Regular Contributor
Posts: 215

Re: Calculating average

Rules are defined based on data that I have.
Respected Advisor
Posts: 4,641

Re: Calculating average

So, what's the rule for studentId = 101?

PG
Regular Contributor
Posts: 215

Re: Calculating average

Hi PG,
I want the average of all student (ID represent the student). The only two exceptions are 1) when the student ID=102, the average will be the average of top 3 course grade

2) when the student ID=103, the average will be the average of top 4 course grade
My output sample says everything.

Thanks,
Super User
Posts: 17,758

Re: Calculating average


mlogan wrote:



My output sample says everything.



If it did, we wouldn't be asking. 

 

Sort dataset by ID and descending grades

Create a record counter and use that to filter your data set

to only records you want to include in Average.

Calculate average via proc means or sql

Merge results back with table

 

OR

step1 same as above

Then create an indicator variable (1/0) that denotes if the record contributes to average. 

Use SQL to calculate and merge results using indicator variables - sum(indicator*value)/sum(indicator) as average

 

Respected Advisor
Posts: 4,641

Re: Calculating average

Assuming your Exceptions data is expressed in a dataset:

 

data have;
input ID Course $ Score;
cards; 
101 Eng 20
101 Bio 20
101 Math 18
105 Eng 15
105 Math 15
105 Phy 15
102 Eng 15
102 Bio 20
102 Phy 16 
102 Arts 12
103 Bio 18
103 Math 20
103 Eng 16
103 Che 15
103 Phy 17
;

data exceptions;
input ID nbTop;
datalines;
102 3
103 4
;

proc sort data=have; by ID descending score; run;

data countingScores;
merge have exceptions (in=ex); by ID;
if first.ID then do;
    order = 0;
    sumScore = 0;
    end;
order + 1;
if not (ex and order > nbTop) then sumScore + Score;
if last.ID then do;
    meanScore = sumScore / min(order, nbTop);
    output;
    end;
keep ID meanScore;
run;

proc print data=countingScores noobs; run;
PG
Solution
‎04-29-2016 01:47 PM
Super User
Posts: 9,671

Re: Calculating average

data have;
input ID Course $ Score;
cards; 
101 Eng 20
101 Bio 20
101 Math 18
105 Eng 15
105 Math 15
105 Phy 15
102 Eng 15
102 Bio 20
102 Phy 16 
102 Arts 12
103 Bio 18
103 Math 20
103 Eng 16
103 Che 15
103 Phy 17
;
RUN;
data want;
n=0;sum=0;
 do until(last.id);
  set have;
  by id notsorted;
  if not missing(Score) then n+1; 
  if id =102 then do;
   if n lt 4 then sum+Score;
  end;
  else if id =103 then do;
   if n lt 5 then sum+Score;
  end;
  else do;
   sum+Score;
  end;
 end;
 
 if id =102 then average=sum/3; 
  else if id =103 then average=sum/4; 
   else average=sum/n; 
   
 do until(last.id);
  set have;
  by id notsorted;
  output;
 end;
drop n sum;
run;
  
Regular Contributor
Posts: 215

Re: Calculating average

Hi Xia,
Thanks for the solution. What I understand the only thing I had to add to your code is sort them by desc Grade.

Now I modified the code and trying to get the weighted average. (e.g: (Score1*credit + Score2* Credit + Score3*Credit)/3)

My cumulative weighted score (cum_wt_score) is not showing top 3 for Grade11 and top 4 Grade12. it is counting all the to the end of last.id. Can you please help. Thanks.



data have;
input Grade $ ST_ID Course $ Score Credit;
cards;
Grade10 101 Eng 20 1
Grade10 101 Bio 25 .5
Grade10 101 Math 25 1
Grade10 101 Che 28 1
Grade10 101 phy 20 1
Grade11 105 Bio 25 1
Grade11 105 Math 18 1
Grade11 105 Eng 15 .5
Grade11 105 Math 15 1
Grade11 105 Phy 15 1
Grade11 102 Eng 15 1
Grade11 102 Bio 20 .5
Grade11 102 Psy 23 1
Grade11 102 Che 26 1
Grade11 102 Phy 16 1
Grade11 102 Arts 12 1
Grade12 103 Bio 10 1
Grade12 103 Math 20 1
Grade12 103 Eng 16 1
Grade12 103 Che 15 1
Grade12 103 Phy 17 1
;
RUN;

PROC SORT DATA=Have
OUT=Have;
BY Grade St_ID descending Score;
RUN;


DATA want;
n=0; cum_credit=0; cum_wt_score=0;
DO UNTIL(last.st_id);
SET have;
BY st_id NOTSORTED;
IF not missing(Score) THEN n+1;

IF Grade ='Grade11' THEN DO;
IF n lt 4 THEN cum_credit+credit; cum_wt_score+(score*credit);
END;

ELSE IF Grade ='Grade12' THEN DO;
IF n lt 5 THEN cum_credit+credit; cum_wt_score+(score*credit);
END;

ELSE DO;
cum_credit+credit; cum_wt_score+(score*credit);
END;
END;

average=cum_wt_score/cum_credit;

DO UNTIL(last.st_id);
SET have;
BY st_id NOTSORTED;
OUTPUT;
END;


RUN;
Regular Contributor
Posts: 215

Re: Calculating average

I think I got it. I should not write like "IF n lt 4 THEN cum_credit+credit; cum_wt_score+(score*credit); ". It should go in two separate if statement.

Thanks,
☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 9 replies
  • 475 views
  • 2 likes
  • 4 in conversation