BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
michokwu
Quartz | Level 8

Hello Experts,

 

I have a dataset (see attached) with information for students in an examination.

I would like to calculate the weighted average of all levels for each unit by category and year

E.g

for Unit 1090, English, 2010

weight(level 1) =((Attended(Level 1)/Attended(1)+.............Attended(Level B2))>>>>>>for all levels

weighted_mean=sum(weight(Level 1)*mean_score(Level 1)..............weight(Level B2)*mean_score(Level B2))>>>>>>>for all levels

I tried to create macros for each variable required to compute the weighted mean but it takes forever to run.

What's the best way to do this? can I create all the macros in a single batch of codes? and how do I put the macros together to calculate the weighted mean.

 

%macro unit_loop(start,end);
proc sql;
	%do unit=&start. %TO &end.;
	create table exam.weighted&unit. as
	select *from exam.scores where unit_code=&unit.;
	%END
	quit;
%mend;
%unit_loop(start=1090,end=401401)

Thank you,

 

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User
PROC MEANS with a BY statement is more efficient. No macros.

Which variable is the weight?

Quick example of PROC MEANS is here. You can include multiple variables in a BY statement to create whatever groups you want. You can explicitly control it in more detail within PROC MEANS using the WAYS and TYPES statements and it will generate the averages at the levels specified including overalls.

SAS (and very few statistical software packages) has an incredible level of flexibility in calculating summary statistics, but SQL does not.

https://github.com/statgeek/SAS-Tutorials/blob/master/proc_means_basic.sas

To add in the weighted portion - add in a "WEIGHT weightVariableName;" to indicate which variable contains the weights

View solution in original post

6 REPLIES 6
Reeza
Super User
PROC MEANS with a BY statement is more efficient. No macros.

Which variable is the weight?

Quick example of PROC MEANS is here. You can include multiple variables in a BY statement to create whatever groups you want. You can explicitly control it in more detail within PROC MEANS using the WAYS and TYPES statements and it will generate the averages at the levels specified including overalls.

SAS (and very few statistical software packages) has an incredible level of flexibility in calculating summary statistics, but SQL does not.

https://github.com/statgeek/SAS-Tutorials/blob/master/proc_means_basic.sas

To add in the weighted portion - add in a "WEIGHT weightVariableName;" to indicate which variable contains the weights
michokwu
Quartz | Level 8
Thank you Reeza. Weight is not given, I have to compute it using the number of students who attended the examination
Reeza
Super User
Wouldn't that be the attended number?
ballardw
Super User

Do you have the data at the student level, as in each thing graded/scored?

Likely a lot easier.

If multiple levels are needed you might consider Proc means/ summary with class statements.

You should have access to the SASHELP.Class data set so can run this code.

Proc summary data=sashelp.class;
   class sex age;
   var height weight;
   output out=work.summary mean=  n= /autoname;
run;

The output has summaries by combinations of the class variables as indicated by the variable _type_ in the output data set. _type_ = 0 is the summary across all records; 1 is the summary within the levels of age only, 2 is the summary within levels of sex only and 3 is the summary of combinations of sex and age.

We summarize two variables and the mean and count per _type_. The autoname option names the output variables: height_mean weight_mean height_n weight_n.

 

Which is one reason we say likely no macro is needed.

Starting with summarized data may be problematic. The proc means/summary has an optional FREQ statement that is used to indicate a variable that contains how many original records are represented. So I think that Freq attend; may be appropriate for this.

michokwu
Quartz | Level 8
Ballardw, thank you, this is very helpful. I don't have student-level data, just aggregated values. For each observation, the proc mean/summary outputs a mean. What I want is to have a mean score for each unit, category and year. E.g. Unit 1090 is expected to have 1 mean score for English in 2010, compared to 1 mean score for each grade level in English in 2010. Thank you.
Reeza
Super User
And PROC MEANS can do that exactly. Have you tried it and it doesn't work in some manner?

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 6 replies
  • 1505 views
  • 3 likes
  • 3 in conversation