turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- SAS Procedures
- /
- Average Multiple Variables

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-01-2009 01:51 PM

Hello All,

This is a bit of continuation of a question, I previously answered, so please bear with me.

1) Here is my example dataset:

data a;

input Fruit $ Ques1 Quest2;

cards;

Apple 1 4

Banana 1 5

Banana 3 4

Apple 2 4

Orange 5 1

Orange 2 4

Orange 3 3

;

run;

2) I would like to find the average for all responses responses to Ques1 & Quest2 (combined). Think of this as the average response for all all the questions of all fruits combined.

3) I would like to find the average for all responses where Fruit='Apple' or Fruit='Banana' etc. Think of this as where I want the average question response to all questions for just bananas, then just apples, etc.

4. How could I limit this to just particular variables? For instance if I wanted the average response to only Ques1 for *just* bananas?

Thanks to all of you again for your time.

_jeff_

This is a bit of continuation of a question, I previously answered, so please bear with me.

1) Here is my example dataset:

data a;

input Fruit $ Ques1 Quest2;

cards;

Apple 1 4

Banana 1 5

Banana 3 4

Apple 2 4

Orange 5 1

Orange 2 4

Orange 3 3

;

run;

2) I would like to find the average for all responses responses to Ques1 & Quest2 (combined). Think of this as the average response for all all the questions of all fruits combined.

3) I would like to find the average for all responses where Fruit='Apple' or Fruit='Banana' etc. Think of this as where I want the average question response to all questions for just bananas, then just apples, etc.

4. How could I limit this to just particular variables? For instance if I wanted the average response to only Ques1 for *just* bananas?

Thanks to all of you again for your time.

_jeff_

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-01-2009 02:19 PM

Have a look at using PROC MEANS with NOPRINT, a CLASS statement, and an OUTPUT statement, and review the generated results with focus on _TYPE_ variable.

Scott Barry

SBBWorks, Inc.

Scott Barry

SBBWorks, Inc.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-01-2009 02:21 PM

Hi Jeff, have you tried Proc Means? Here are examples, in order of your questions.

(2)

Proc Means data=a;

output out=means mean=;

Run; Quit;

(3)

Proc Sort data=a; by fruit; run;

Proc Means data=a;

by fruit;

output out=means mean=;

Run; Quit;

(4)

Proc Means data=a;

var ques1 quest2;

where fruit='Banana';

output out=means mean=;

Run; Quit;

Happy SASsing! Tricia (Added Proc Sort to #3.)

Message was edited by: sassygrl

(2)

Proc Means data=a;

output out=means mean=;

Run; Quit;

(3)

Proc Sort data=a; by fruit; run;

Proc Means data=a;

by fruit;

output out=means mean=;

Run; Quit;

(4)

Proc Means data=a;

var ques1 quest2;

where fruit='Banana';

output out=means mean=;

Run; Quit;

Happy SASsing! Tricia (Added Proc Sort to #3.)

Message was edited by: sassygrl

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-01-2009 03:07 PM

sasygirl -

Thank you for your time and response. I like the examples which you have provided, the only downside being that each of your solutions keeps the variables separate.

For instance, for #2, I want the mean for all responses to both Ques1 and Quest2 (combined) . Not as separate values. When running your solution to #2, I get the following:

Ques1 7 2.4285714

Quest2 7 3.5714286

What I want is one number/mean which represents the mean for all responses to both Quest1 and Quest2.

My 'real world' dataset is survey, where the first 5 questions represent the employees responses to 'Job related' questions... and I'm trying to create one number which is "Overall Job". This number is the average response from all users to all questions. This number will be used as a baseline to compare individual divisions against (hence #3 in my original post).

Thanks!

Thank you for your time and response. I like the examples which you have provided, the only downside being that each of your solutions keeps the variables separate.

For instance, for #2, I want the mean for all responses to both Ques1 and Quest2 (combined) . Not as separate values. When running your solution to #2, I get the following:

Ques1 7 2.4285714

Quest2 7 3.5714286

What I want is one number/mean which represents the mean for all responses to both Quest1 and Quest2.

My 'real world' dataset is survey, where the first 5 questions represent the employees responses to 'Job related' questions... and I'm trying to create one number which is "Overall Job". This number is the average response from all users to all questions. This number will be used as a baseline to compare individual divisions against (hence #3 in my original post).

Thanks!

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-01-2009 03:13 PM

Using a DATA step, generate additional observations masking your BY variable as value "*ALL*" or something.

Scott Barry

SBBWorks, Inc.

Scott Barry

SBBWorks, Inc.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-01-2009 02:34 PM

You can also use Proc Sql with Group by and Having clauses, if you are good with Proc Sql;

Below is the code for the your questions 2,3 4 in the order:

proc sql;

create table b as

select fruit, ques1, quest2, avg(ques1+quest2) as avg

from a;

quit;

proc sql;

create table c as

select fruit, ques1 , quest2, avg(ques1+quest2) as avg1

from a

group by fruit

having fruit in ('Banana', 'Apple');

quit;

proc sql;

create table d as

select fruit, ques1 , quest2, avg(ques1) as avg2

from a

group by fruit

having fruit in ('Banana');

quit;

~ Sukanya E

Below is the code for the your questions 2,3 4 in the order:

proc sql;

create table b as

select fruit, ques1, quest2, avg(ques1+quest2) as avg

from a;

quit;

proc sql;

create table c as

select fruit, ques1 , quest2, avg(ques1+quest2) as avg1

from a

group by fruit

having fruit in ('Banana', 'Apple');

quit;

proc sql;

create table d as

select fruit, ques1 , quest2, avg(ques1) as avg2

from a

group by fruit

having fruit in ('Banana');

quit;

~ Sukanya E

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-01-2009 03:10 PM

Sukanya,

Thank you for your time as well, however, your response to #2 creates incorrect results.

proc sql;

create table b as

select fruit, ques1, quest2, avg(ques1+quest2) as avg

from a;

quit;

It shows the 'avg' as being '6', which can't be correct given that my survey response values only range from 1-4.

Just wanted to share what I noticed.

thanks!

Thank you for your time as well, however, your response to #2 creates incorrect results.

proc sql;

create table b as

select fruit, ques1, quest2, avg(ques1+quest2) as avg

from a;

quit;

It shows the 'avg' as being '6', which can't be correct given that my survey response values only range from 1-4.

Just wanted to share what I noticed.

thanks!

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-01-2009 03:18 PM

Oh..You said you wanted the average of sum of ques1 and quest2. Average of both the responses give the result 6 as per the data. Is not that you wanted ?? Am I misisng some thing ?? How about the other results 3, 4 ??

~ Sukanya E

~ Sukanya E

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-01-2009 03:25 PM

I apologize if I am not explaining my need very well. Let me try an example

If my dataset looked like this:

Apple 1 2

Banana 3 4

Orange 2 3

I want the 'average' of all responses. So, in this dataset I have a total of 6 responses that have been recorded. So the 'math' would look like this for this small dataset:

1 + 2 + 3 + 4 + 2 + 3 = 15

15 / 6 = 2.5

So, my overall average is 2.5

Does that help clarify? Again, my apologies

If my dataset looked like this:

Apple 1 2

Banana 3 4

Orange 2 3

I want the 'average' of all responses. So, in this dataset I have a total of 6 responses that have been recorded. So the 'math' would look like this for this small dataset:

1 + 2 + 3 + 4 + 2 + 3 = 15

15 / 6 = 2.5

So, my overall average is 2.5

Does that help clarify? Again, my apologies

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-01-2009 03:31 PM

Oh...I got it...Just divide by 2....Change the statement to avg(ques1+quest2) as avg to avg(ques+quest2)/2 as avg in select statement in 2 and 3...

I hope this works for you now...

~ Sukanya E

I hope this works for you now...

~ Sukanya E

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-01-2009 03:39 PM

Perfect! Thank you for your time AND patience! I haven't verified #4 yet, but by dividing by 2, all seems to work perfectly for #2 and #3.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-01-2009 03:46 PM

For #4, you don't have to divide by 2 as you are considering only for Ques1...

Hope this makes sense..

~ Sukanya E

Hope this makes sense..

~ Sukanya E

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-01-2009 03:27 PM

If you want the separate means for ques1 and quest2, then change the statement avg(ques1+quest2) as avg to avg(ques1) as qvg1, avg(quest2) as qvg2 in select statement.

~ Sukanya E

~ Sukanya E

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-01-2009 02:37 PM

Hello Jeff,

Try the following code, after what you have:

data long;

set a;

ques=1;response=ques1;output;

ques=2;response=quest2;output;

drop ques1 quest2;

run;

proc means data=long noprint;

class fruit ques;

var response;

output out=manymeans mean=mean;

run;

Now take a look at the dataset manymeans. In it is a variable called _TYPE_. By looking at the appropriate value of _TYPE_, you will get exactly the kinds of averages you were talking about.

_TYPE_=0 gives the grand mean across all questions and fruit types. (your point #2)

_TYPE_=1 gives the mean of each question (1 and 2), across all fruit types.

_TYPE_=2 gives the mean of each fruit type, across both questions. (your point #3)

_TYPE_=3 gives the mean of each question for each fruit type. (your point #4)

Good luck with this.

Steve Denham

Try the following code, after what you have:

data long;

set a;

ques=1;response=ques1;output;

ques=2;response=quest2;output;

drop ques1 quest2;

run;

proc means data=long noprint;

class fruit ques;

var response;

output out=manymeans mean=mean;

run;

Now take a look at the dataset manymeans. In it is a variable called _TYPE_. By looking at the appropriate value of _TYPE_, you will get exactly the kinds of averages you were talking about.

_TYPE_=0 gives the grand mean across all questions and fruit types. (your point #2)

_TYPE_=1 gives the mean of each question (1 and 2), across all fruit types.

_TYPE_=2 gives the mean of each fruit type, across both questions. (your point #3)

_TYPE_=3 gives the mean of each question for each fruit type. (your point #4)

Good luck with this.

Steve Denham