DATA Step, Macro, Functions and more

Merging data sets

Accepted Solution Solved
Reply
Regular Contributor
Posts: 216
Accepted Solution

Merging data sets

Hi

I would like to merge two data sets with the intent of just adding one column (Denominator) to the "s6Freq72Numerator" data set. I thought I was able to easily do this in the past, but maybe not. The code I thought would be successful in doing this is below. However, when I use this, the resulting data set is not sorted by exitMonthCategory and and there are missing records from "s6Freq72Numerator".

I am attaching a spreadsheet that has an example of the data from the "s6Freq72Numerator" and "s6Freq72Denominator" data sets and what I was intending as the result.

If anyone has any ideas, please let me know.

Paul

proc sort data=s6Freq72Numerator;

by cnty_name startyear exitMonthCategory;

run;

proc sort data=s6Freq72Denominator;

by cnty_name startyear exitMonthCategory;

run;

data s6Final72;

merge s6Freq72Numerator s6Freq72Denominator;

by cnty_name startyear;

run;


Accepted Solutions
Solution
‎02-21-2013 12:11 PM
PROC Star
Posts: 7,363

Re: Merging data sets

Paul,

The following worked for me:

data s6Final72;

  merge  s6Freq72Denominator (drop=exitMonthCategory) s6Freq72Numerator (in=a);

  by cnty_name startyear;

  if a;

run;

View solution in original post


All Replies
PROC Star
Posts: 7,363

Re: Merging data sets

Paul,

Without looking at your data, why don't you just try adding exitMonthCategory at the end of the merge by statement?

Regular Contributor
Posts: 216

Re: Merging data sets

Hi Art

Because it will only print the Denominator to the one value that matches on the exitMonthCategory value. There are other values of exitMonthCategory in s6Freq72Numerator that will be blank. I tried that already actually.

Paul

Super User
Posts: 10,500

Re: Merging data sets

Since both of your input data sets are sorted by cnty_name startyear and exitMonthCategory the output of the merged data set is sorted on the same variables. If you need the output sorted by exitMonthCategory you need another sort after the merge.

You probably have missing values for your numerator variable because the combination of the sort variables didn't exist in the numerator data set but did exist in the denominator set. Or you have multiples.

Alternative approach might be

(This does not require sorting the data first)

proc sql;

     create table s6Final72 as

     select a.*, b.s6Freq72Numerator

     froms 6Freq72Denominator as a left joint s6Freq72Numerator as b on

          a.cnty_name=b.cnty_name and a.startyear=b.startyear and a.exitMonthCategory=b.exitMonthCategory

    order by a.exitMonthCategory;

quit;

Regular Contributor
Posts: 216

Re: Merging data sets

Hi Ballard

I didn't think of using proc sql, but will give it a shot.

Paul

New Contributor
Posts: 3

Re: Merging data sets

Hi Paul,

Below code will give you the desired output, exactly matching to your Expected results -


data s6Final72;

merge s6Freq72Denominator s6Freq72Numerator(in=a) ;

by cnty_name startyear;
if a;
run;

Regards

Regular Contributor
Posts: 216

Re: Merging data sets

Hi Kij

I actually tried that and it comes close, however for some reason the '6' value of exitMonthCategory in s6Freq72Numerator gets renamed to the exitMonthCategory in s6Freq72Denominator and the CumulativeNumber is set to the Denominator in s6Freq72Denominator for the given county/year combo. If I can't find any proper way to do this, then I was going to do what you have below and rename all the '6' values back to their correct records.

Paul

Solution
‎02-21-2013 12:11 PM
PROC Star
Posts: 7,363

Re: Merging data sets

Paul,

The following worked for me:

data s6Final72;

  merge  s6Freq72Denominator (drop=exitMonthCategory) s6Freq72Numerator (in=a);

  by cnty_name startyear;

  if a;

run;

Occasional Contributor
Posts: 7

Re: Merging data sets

The important feature of this and kij23's reply is that the denominator file precedes the numerator file. When that's the case, the drop= option should not be necessary, I think ... but it's always cleaner.

Regular Contributor
Posts: 216

Re: Merging data sets

Hi Art

The second one worked perfectly. Thanks a lot. I guess I originally missed the drop statement then.

Paul

PROC Star
Posts: 7,363

Re: Merging data sets

Paul,

As Dan pointed out, the drop statement really wasn't even necessary.

The problem you had was (1) using the denominator file as the 2nd rather than the first file and (2) not using an in= option to exclude irrelevant records from the denominator file.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 10 replies
  • 482 views
  • 0 likes
  • 5 in conversation