Hi
I would like to merge two data sets with the intent of just adding one column (Denominator) to the "s6Freq72Numerator" data set. I thought I was able to easily do this in the past, but maybe not. The code I thought would be successful in doing this is below. However, when I use this, the resulting data set is not sorted by exitMonthCategory and and there are missing records from "s6Freq72Numerator".
I am attaching a spreadsheet that has an example of the data from the "s6Freq72Numerator" and "s6Freq72Denominator" data sets and what I was intending as the result.
If anyone has any ideas, please let me know.
Paul
proc sort data=s6Freq72Numerator;
by cnty_name startyear exitMonthCategory;
run;
proc sort data=s6Freq72Denominator;
by cnty_name startyear exitMonthCategory;
run;
data s6Final72;
merge s6Freq72Numerator s6Freq72Denominator;
by cnty_name startyear;
run;
Paul,
The following worked for me:
data s6Final72;
merge s6Freq72Denominator (drop=exitMonthCategory) s6Freq72Numerator (in=a);
by cnty_name startyear;
if a;
run;
Paul,
Without looking at your data, why don't you just try adding exitMonthCategory at the end of the merge by statement?
Hi Art
Because it will only print the Denominator to the one value that matches on the exitMonthCategory value. There are other values of exitMonthCategory in s6Freq72Numerator that will be blank. I tried that already actually.
Paul
Since both of your input data sets are sorted by cnty_name startyear and exitMonthCategory the output of the merged data set is sorted on the same variables. If you need the output sorted by exitMonthCategory you need another sort after the merge.
You probably have missing values for your numerator variable because the combination of the sort variables didn't exist in the numerator data set but did exist in the denominator set. Or you have multiples.
Alternative approach might be
(This does not require sorting the data first)
proc sql;
create table s6Final72 as
select a.*, b.s6Freq72Numerator
froms 6Freq72Denominator as a left joint s6Freq72Numerator as b on
a.cnty_name=b.cnty_name and a.startyear=b.startyear and a.exitMonthCategory=b.exitMonthCategory
order by a.exitMonthCategory;
quit;
Hi Ballard
I didn't think of using proc sql, but will give it a shot.
Paul
Hi Paul,
Below code will give you the desired output, exactly matching to your Expected results -
data s6Final72;
merge s6Freq72Denominator s6Freq72Numerator(in=a) ;
by cnty_name startyear;
if a;
run;
Regards
Hi Kij
I actually tried that and it comes close, however for some reason the '6' value of exitMonthCategory in s6Freq72Numerator gets renamed to the exitMonthCategory in s6Freq72Denominator and the CumulativeNumber is set to the Denominator in s6Freq72Denominator for the given county/year combo. If I can't find any proper way to do this, then I was going to do what you have below and rename all the '6' values back to their correct records.
Paul
Paul,
The following worked for me:
data s6Final72;
merge s6Freq72Denominator (drop=exitMonthCategory) s6Freq72Numerator (in=a);
by cnty_name startyear;
if a;
run;
The important feature of this and kij23's reply is that the denominator file precedes the numerator file. When that's the case, the drop= option should not be necessary, I think ... but it's always cleaner.
Hi Art
The second one worked perfectly. Thanks a lot. I guess I originally missed the drop statement then.
Paul
Paul,
As Dan pointed out, the drop statement really wasn't even necessary.
The problem you had was (1) using the denominator file as the 2nd rather than the first file and (2) not using an in= option to exclude irrelevant records from the denominator file.
Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.
Register today!Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.