BookmarkSubscribeRSS Feed
deleted_user
Not applicable
Disclaimer: I had a hard time coming up with a title for this thread, so my apologies if the title is misleading and/or this is not posted in the appropriate section of the forums.

My Question:

In my example data set I have multiple observations for 'orange' 'apple' 'banana' etc... and next to these I hava varaibles 'taste' 'flavor'. And the values for taste and flavor are numeric (1-5) based on survey feedback. So, my data set might look like this:

Fruit Taste Flavor
Apple 1 4
Banana 1 5
Banana 3 4
Apple 2 4
Orange 5 1
Orange 2 4
Orange 3 3

I would like to be able to combine all observations for 'apple', 'banana', etc. into a single observation where 'taste' and 'flavor' now contain the average of the responses to taste and flavor for 'apple' for all survey responses.

For example, I want my dataset to now look like this (a single row for each type of fruit, and the responses for taste and flavor now averaged):

Fruit Taste Flavor
Apple 1.5 4
Banana 2 4.5
Orange 3.33 2.66


All help is greatly appreciated! Yes, I'm pretty new to this!

Thanks all!
4 REPLIES 4
Pavan_SAS
SAS Employee
data a;
input x $ y z;
cards;
Apple 1 4
Banana 1 5
Banana 3 4
Apple 2 4
Orange 5 1
Orange 2 4
Orange 3 3
;
run;
proc sort;
by x;
run;

proc means data=a noprint;
by x;
output out=b(drop=_type_ _freq_) mean=;
run;
deleted_user
Not applicable
Pavan,

Thank you for your time and the example. My data is already in a dataset, so it is my understanding that I will just leave off the data step you included.

My example was a very simplistic one which captured the problem I am facing with a much larger dataset.

To take this example a step further, I have another question:

How would I go about creating a 4th variable (we'll call it Overall Taste), which would essentially just be an average of y and z (from your example)? So, in your example, in the proc means step y and z are an 'average' in the output dataset... I would like a 4th variable (we'll call it 'a')... which is the average of y and z.

All help is GREATLY appreciated!
ChrisNZ
Tourmaline | Level 20
To summarise the summary, you need two steps.

data a;
input Fruit $ Taste Flavor;
cards;
Apple 1 4
Banana 1 5
Banana 3 4
Apple 2 4
Orange 5 1
Orange 2 4
Orange 3 3
;
run;
proc means data=a noprint nway;
class fruit;
output out=b(drop=_type_ _freq_) mean=;
run;
data c;
set b;
overall=mean(Taste,Flavor);
run;

Note that you could have 2000 oranges and 2 apples, and their respective mean would weight the same. Ask yourself if you really want the mean of the mean, or whether you want to weigh the averages.
deleted_user
Not applicable
Hi,

Please try this, hope this will help you

data a;
input Fruit $ Taste Flavor;
cards;
Apple 1 4
Banana 1 5
Banana 3 4
Apple 2 4
Orange 5 1
Orange 2 4
Orange 3 3
;
run;
proc sql;
select fruit,mean(taste) as taste,mean(flavor) as flavour, mean(mean(taste),mean(flavor)) as AVG_TASTE_FLAVOUR
from a
group by fruit;
quit;

Regards,
skm

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1210 views
  • 0 likes
  • 3 in conversation