Solved: Re: Calculate the average value from two data sets when occurs duplica...

Efka · Posted 04-08-2013 04:29 PM

Hello,

I have two data sets, where one column is exactly the same and another holds different values. There are two simple data sets:

First data set: Second data set:

OBS Name Income OBS Name Income

1 John 1201 1 John 5566

2 David 1602 2 David 5981

3 Karen 1553 3 Karen 8930

As You can see, the names are duplicating, however the income column is different.

The question would be - how can I make a single data set where:

1) The names do not duplicate anymore.

2) The data in 'income' column would be calculated as an average value from first and second data sets.

3) If possible, are there more than one solution for this situation? Please suggest the most efficient ways to do this,

because I'll need to apply suggested solution for many big data sets.

The output should be:

New data set

Name Income

John 3383,5 which is average of two previous data sets - ((1201+5566)/2)

David 3791,5

Karen 5241,5

Thank You.

Sudhakar_A · Posted 04-09-2013 12:50 AM

another simple method.

data fin;

merge one two(rename=(income=income1));

by name;

avgincome=mean(income, income1);

run;

View solution in original post

SandyH · Posted 04-08-2013 08:47 PM

1. set the two datasets with two columns (name, income).

2. proc sql;

avg(income) as income

group by name

OR

merge the two datasets by name, with three columns. then calculate average.

art297 · Posted 04-09-2013 12:14 AM

or. slightly faster:

proc sql;

create table want as

select name,avg(income) as income

from

(select * from one

outer union corr

select * from two)

group by name

;

quit;

Sudhakar_A · Posted 04-09-2013 12:50 AM

another simple method.

data fin;

merge one two(rename=(income=income1));

by name;

avgincome=mean(income, income1);

run;

Calculate the average value from two data sets when occurs duplication in specific column

Re: Calculate the average value from two data sets when occurs duplication in specific column

Re: Calculate the average value from two data sets when occurs duplication in specific column

Re: Calculate the average value from two data sets when occurs duplication in specific column

Re: Calculate the average value from two data sets when occurs duplication in specific column

Calculate the average value from two data sets when occurs duplication in specific column

Re: Calculate the average value from two data sets when occurs duplication in specific column

Re: Calculate the average value from two data sets when occurs duplication in specific column

Re: Calculate the average value from two data sets when occurs duplication in specific column

Re: Calculate the average value from two data sets when occurs duplication in specific column

Ready to join fellow brilliant minds for the SAS Hackathon?

Click image to register for webinar

Classroom Training Available!