DATA Step, Macro, Functions and more

Merging Observations In A DataSet

Reply
N/A
Posts: 0

Merging Observations In A DataSet

Disclaimer: I had a hard time coming up with a title for this thread, so my apologies if the title is misleading and/or this is not posted in the appropriate section of the forums.

My Question:

In my example data set I have multiple observations for 'orange' 'apple' 'banana' etc... and next to these I hava varaibles 'taste' 'flavor'. And the values for taste and flavor are numeric (1-5) based on survey feedback. So, my data set might look like this:

Fruit Taste Flavor
Apple 1 4
Banana 1 5
Banana 3 4
Apple 2 4
Orange 5 1
Orange 2 4
Orange 3 3

I would like to be able to combine all observations for 'apple', 'banana', etc. into a single observation where 'taste' and 'flavor' now contain the average of the responses to taste and flavor for 'apple' for all survey responses.

For example, I want my dataset to now look like this (a single row for each type of fruit, and the responses for taste and flavor now averaged):

Fruit Taste Flavor
Apple 1.5 4
Banana 2 4.5
Orange 3.33 2.66


All help is greatly appreciated! Yes, I'm pretty new to this!

Thanks all!
SAS Employee
Posts: 73

Re: Merging Observations In A DataSet

data a;
input x $ y z;
cards;
Apple 1 4
Banana 1 5
Banana 3 4
Apple 2 4
Orange 5 1
Orange 2 4
Orange 3 3
;
run;
proc sort;
by x;
run;

proc means data=a noprint;
by x;
output out=b(drop=_type_ _freq_) mean=;
run;
N/A
Posts: 0

Re: Merging Observations In A DataSet

Pavan,

Thank you for your time and the example. My data is already in a dataset, so it is my understanding that I will just leave off the data step you included.

My example was a very simplistic one which captured the problem I am facing with a much larger dataset.

To take this example a step further, I have another question:

How would I go about creating a 4th variable (we'll call it Overall Taste), which would essentially just be an average of y and z (from your example)? So, in your example, in the proc means step y and z are an 'average' in the output dataset... I would like a 4th variable (we'll call it 'a')... which is the average of y and z.

All help is GREATLY appreciated!
PROC Star
Posts: 1,564

Re: Merging Observations In A DataSet

To summarise the summary, you need two steps.

data a;
input Fruit $ Taste Flavor;
cards;
Apple 1 4
Banana 1 5
Banana 3 4
Apple 2 4
Orange 5 1
Orange 2 4
Orange 3 3
;
run;
proc means data=a noprint nway;
class fruit;
output out=b(drop=_type_ _freq_) mean=;
run;
data c;
set b;
overall=mean(Taste,Flavor);
run;

Note that you could have 2000 oranges and 2 apples, and their respective mean would weight the same. Ask yourself if you really want the mean of the mean, or whether you want to weigh the averages.
N/A
Posts: 0

Re: Merging Observations In A DataSet

Hi,

Please try this, hope this will help you

data a;
input Fruit $ Taste Flavor;
cards;
Apple 1 4
Banana 1 5
Banana 3 4
Apple 2 4
Orange 5 1
Orange 2 4
Orange 3 3
;
run;
proc sql;
select fruit,mean(taste) as taste,mean(flavor) as flavour, mean(mean(taste),mean(flavor)) as AVG_TASTE_FLAVOUR
from a
group by fruit;
quit;

Regards,
skm
Ask a Question
Discussion stats
  • 4 replies
  • 130 views
  • 0 likes
  • 3 in conversation