Solved: Merging data sets

Paul_NYS · Posted 02-21-2013 10:03 AM

Hi

I would like to merge two data sets with the intent of just adding one column (Denominator) to the "s6Freq72Numerator" data set. I thought I was able to easily do this in the past, but maybe not. The code I thought would be successful in doing this is below. However, when I use this, the resulting data set is not sorted by exitMonthCategory and and there are missing records from "s6Freq72Numerator".

I am attaching a spreadsheet that has an example of the data from the "s6Freq72Numerator" and "s6Freq72Denominator" data sets and what I was intending as the result.

If anyone has any ideas, please let me know.

Paul

proc sort data=s6Freq72Numerator;

by cnty_name startyear exitMonthCategory;

run;

proc sort data=s6Freq72Denominator;

by cnty_name startyear exitMonthCategory;

run;

data s6Final72;

merge s6Freq72Numerator s6Freq72Denominator;

by cnty_name startyear;

run;

art297 · Posted 02-21-2013 12:11 PM

Paul,

The following worked for me:

data s6Final72;

merge s6Freq72Denominator (drop=exitMonthCategory) s6Freq72Numerator (in=a);

by cnty_name startyear;

if a;

run;

View solution in original post

art297 · Posted 02-21-2013 11:27 AM

Paul,

Without looking at your data, why don't you just try adding exitMonthCategory at the end of the merge by statement?

Paul_NYS · Posted 02-21-2013 11:41 AM

Hi Art

Because it will only print the Denominator to the one value that matches on the exitMonthCategory value. There are other values of exitMonthCategory in s6Freq72Numerator that will be blank. I tried that already actually.

Paul

ballardw · Posted 02-21-2013 11:28 AM

Since both of your input data sets are sorted by cnty_name startyear and exitMonthCategory the output of the merged data set is sorted on the same variables. If you need the output sorted by exitMonthCategory you need another sort after the merge.

You probably have missing values for your numerator variable because the combination of the sort variables didn't exist in the numerator data set but did exist in the denominator set. Or you have multiples.

Alternative approach might be

(This does not require sorting the data first)

proc sql;

create table s6Final72 as

select a.*, b.s6Freq72Numerator

froms 6Freq72Denominator as a left joint s6Freq72Numerator as b on

a.cnty_name=b.cnty_name and a.startyear=b.startyear and a.exitMonthCategory=b.exitMonthCategory

order by a.exitMonthCategory;

quit;

Paul_NYS · Posted 02-21-2013 11:51 AM

Hi Ballard

I didn't think of using proc sql, but will give it a shot.

Paul

kij23 · Posted 02-21-2013 11:32 AM

Hi Paul,

Below code will give you the desired output, exactly matching to your Expected results -

data s6Final72;

merge s6Freq72Denominator s6Freq72Numerator(in=a) ;

by cnty_name startyear;
if a;
run;

Regards

Paul_NYS · Posted 02-21-2013 11:49 AM

Hi Kij

I actually tried that and it comes close, however for some reason the '6' value of exitMonthCategory in s6Freq72Numerator gets renamed to the exitMonthCategory in s6Freq72Denominator and the CumulativeNumber is set to the Denominator in s6Freq72Denominator for the given county/year combo. If I can't find any proper way to do this, then I was going to do what you have below and rename all the '6' values back to their correct records.

Paul

art297 · Posted 02-21-2013 12:11 PM

Paul,

The following worked for me:

data s6Final72;

merge s6Freq72Denominator (drop=exitMonthCategory) s6Freq72Numerator (in=a);

by cnty_name startyear;

if a;

run;

DanWALDO · Posted 02-21-2013 12:37 PM

The important feature of this and kij23's reply is that the denominator file precedes the numerator file. When that's the case, the drop= option should not be necessary, I think ... but it's always cleaner.

Paul_NYS · Posted 02-21-2013 12:59 PM

Hi Art

The second one worked perfectly. Thanks a lot. I guess I originally missed the drop statement then.

Paul

art297 · Posted 02-21-2013 01:50 PM

Paul,

As Dan pointed out, the drop statement really wasn't even necessary.

The problem you had was (1) using the denominator file as the 2nd rather than the first file and (2) not using an in= option to exclude irrelevant records from the denominator file.

Merging data sets

Re: Merging data sets

Re: Merging data sets

Re: Merging data sets

Re: Merging data sets

Re: Merging data sets

Re: Merging data sets

Re: Merging data sets

Re: Merging data sets

Re: Merging data sets

Re: Merging data sets

Re: Merging data sets

Catch up on SAS Innovate 2026

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away