Solved
Not applicable
Posts: 0

# Average Multiple Variables

Hello All,

This is a bit of continuation of a question, I previously answered, so please bear with me.

1) Here is my example dataset:

data a;
input Fruit \$ Ques1 Quest2;
cards;
Apple 1 4
Banana 1 5
Banana 3 4
Apple 2 4
Orange 5 1
Orange 2 4
Orange 3 3
;
run;

2) I would like to find the average for all responses responses to Ques1 & Quest2 (combined). Think of this as the average response for all all the questions of all fruits combined.

3) I would like to find the average for all responses where Fruit='Apple' or Fruit='Banana' etc. Think of this as where I want the average question response to all questions for just bananas, then just apples, etc.

4. How could I limit this to just particular variables? For instance if I wanted the average response to only Ques1 for *just* bananas?

Thanks to all of you again for your time.

_jeff_

Accepted Solutions
Solution
‎08-31-2017 02:56 PM
Not applicable
Posts: 0

## Re: Average Multiple Variables

[ Edited ]

Editor's Note: This solution, along with the addition of the following in a later post:

Change the statement to avg(ques1+quest2) as avg to avg(ques+quest2)/2 as avg in select statement in 2 and 3...

provide answers to all questions asked. Thanks also to @SteveDenham for providing a PROC MEANS alternative.

You can also use Proc Sql with Group by and Having clauses, if you are good with Proc Sql;

Below is the code for the your questions 2,3 4 in the order:

proc sql;
create table b as
select fruit, ques1, quest2, avg(ques1+quest2) as avg
from a;
quit;

proc sql;
create table c as
select fruit, ques1 , quest2, avg(ques1+quest2) as avg1
from a
group by fruit
having fruit in ('Banana', 'Apple');
quit;

proc sql;
create table d as
select fruit, ques1 , quest2, avg(ques1) as avg2
from a
group by fruit
having fruit in ('Banana');
quit;

~ Sukanya E

All Replies
Super Contributor
Posts: 3,176

## Re: Average Multiple Variables

Have a look at using PROC MEANS with NOPRINT, a CLASS statement, and an OUTPUT statement, and review the generated results with focus on _TYPE_ variable.

Scott Barry
SBBWorks, Inc.
Contributor
Posts: 29

## Re: Average Multiple Variables

Hi Jeff, have you tried Proc Means? Here are examples, in order of your questions.

(2)

Proc Means data=a;
output out=means mean=;
Run; Quit;

(3)

Proc Sort data=a; by fruit; run;
Proc Means data=a;
by fruit;
output out=means mean=;
Run; Quit;

(4)

Proc Means data=a;
var ques1 quest2;
where fruit='Banana';
output out=means mean=;
Run; Quit;

Happy SASsing! Tricia (Added Proc Sort to #3.)

Message was edited by: sassygrl
Not applicable
Posts: 0

## Re: Average Multiple Variables

sasygirl -

Thank you for your time and response. I like the examples which you have provided, the only downside being that each of your solutions keeps the variables separate.

For instance, for #2, I want the mean for all responses to both Ques1 and Quest2 (combined) . Not as separate values. When running your solution to #2, I get the following:

Ques1 7 2.4285714
Quest2 7 3.5714286

What I want is one number/mean which represents the mean for all responses to both Quest1 and Quest2.

My 'real world' dataset is survey, where the first 5 questions represent the employees responses to 'Job related' questions... and I'm trying to create one number which is "Overall Job". This number is the average response from all users to all questions. This number will be used as a baseline to compare individual divisions against (hence #3 in my original post).

Thanks!
Super Contributor
Posts: 3,176

## Re: Average Multiple Variables

Scott Barry
SBBWorks, Inc.
Solution
‎08-31-2017 02:56 PM
Not applicable
Posts: 0

## Re: Average Multiple Variables

[ Edited ]

Editor's Note: This solution, along with the addition of the following in a later post:

Change the statement to avg(ques1+quest2) as avg to avg(ques+quest2)/2 as avg in select statement in 2 and 3...

provide answers to all questions asked. Thanks also to @SteveDenham for providing a PROC MEANS alternative.

You can also use Proc Sql with Group by and Having clauses, if you are good with Proc Sql;

Below is the code for the your questions 2,3 4 in the order:

proc sql;
create table b as
select fruit, ques1, quest2, avg(ques1+quest2) as avg
from a;
quit;

proc sql;
create table c as
select fruit, ques1 , quest2, avg(ques1+quest2) as avg1
from a
group by fruit
having fruit in ('Banana', 'Apple');
quit;

proc sql;
create table d as
select fruit, ques1 , quest2, avg(ques1) as avg2
from a
group by fruit
having fruit in ('Banana');
quit;

~ Sukanya E

Not applicable
Posts: 0

## Re: Average Multiple Variables

Sukanya,

Thank you for your time as well, however, your response to #2 creates incorrect results.

proc sql;
create table b as
select fruit, ques1, quest2, avg(ques1+quest2) as avg
from a;
quit;

It shows the 'avg' as being '6', which can't be correct given that my survey response values only range from 1-4.

Just wanted to share what I noticed.

thanks!
Not applicable
Posts: 0

## Re: Average Multiple Variables

Oh..You said you wanted the average of sum of ques1 and quest2. Average of both the responses give the result 6 as per the data. Is not that you wanted ?? Am I misisng some thing ?? How about the other results 3, 4 ??

~ Sukanya E
Not applicable
Posts: 0

## Re: Average Multiple Variables

I apologize if I am not explaining my need very well. Let me try an example

If my dataset looked like this:

Apple 1 2
Banana 3 4
Orange 2 3

I want the 'average' of all responses. So, in this dataset I have a total of 6 responses that have been recorded. So the 'math' would look like this for this small dataset:

1 + 2 + 3 + 4 + 2 + 3 = 15
15 / 6 = 2.5

So, my overall average is 2.5

Does that help clarify? Again, my apologies
Not applicable
Posts: 0

## Re: Average Multiple Variables

Oh...I got it...Just divide by 2....Change the statement to avg(ques1+quest2) as avg to avg(ques+quest2)/2 as avg in select statement in 2 and 3...

I hope this works for you now...

~ Sukanya E
Not applicable
Posts: 0

## Re: Average Multiple Variables

Perfect! Thank you for your time AND patience! I haven't verified #4 yet, but by dividing by 2, all seems to work perfectly for #2 and #3.
Not applicable
Posts: 0

## Re: Average Multiple Variables

For #4, you don't have to divide by 2 as you are considering only for Ques1...
Hope this makes sense..

~ Sukanya E
Not applicable
Posts: 0

## Re: Average Multiple Variables

If you want the separate means for ques1 and quest2, then change the statement avg(ques1+quest2) as avg to avg(ques1) as qvg1, avg(quest2) as qvg2 in select statement.

~ Sukanya E
Posts: 2,655

## Re: Average Multiple Variables

Hello Jeff,

Try the following code, after what you have:

data long;
set a;
ques=1;response=ques1;output;
ques=2;response=quest2;output;
drop ques1 quest2;
run;

proc means data=long noprint;
class fruit ques;
var response;
output out=manymeans mean=mean;
run;

Now take a look at the dataset manymeans. In it is a variable called _TYPE_. By looking at the appropriate value of _TYPE_, you will get exactly the kinds of averages you were talking about.

_TYPE_=0 gives the grand mean across all questions and fruit types. (your point #2)
_TYPE_=1 gives the mean of each question (1 and 2), across all fruit types.
_TYPE_=2 gives the mean of each fruit type, across both questions. (your point #3)
_TYPE_=3 gives the mean of each question for each fruit type. (your point #4)

Good luck with this.

Steve Denham
🔒 This topic is solved and locked.