BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Paul_NYS
Obsidian | Level 7

Hi

I would like to merge two data sets with the intent of just adding one column (Denominator) to the "s6Freq72Numerator" data set. I thought I was able to easily do this in the past, but maybe not. The code I thought would be successful in doing this is below. However, when I use this, the resulting data set is not sorted by exitMonthCategory and and there are missing records from "s6Freq72Numerator".

I am attaching a spreadsheet that has an example of the data from the "s6Freq72Numerator" and "s6Freq72Denominator" data sets and what I was intending as the result.

If anyone has any ideas, please let me know.

Paul

proc sort data=s6Freq72Numerator;

by cnty_name startyear exitMonthCategory;

run;

proc sort data=s6Freq72Denominator;

by cnty_name startyear exitMonthCategory;

run;

data s6Final72;

merge s6Freq72Numerator s6Freq72Denominator;

by cnty_name startyear;

run;

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

Paul,

The following worked for me:

data s6Final72;

  merge  s6Freq72Denominator (drop=exitMonthCategory) s6Freq72Numerator (in=a);

  by cnty_name startyear;

  if a;

run;

View solution in original post

10 REPLIES 10
art297
Opal | Level 21

Paul,

Without looking at your data, why don't you just try adding exitMonthCategory at the end of the merge by statement?

Paul_NYS
Obsidian | Level 7

Hi Art

Because it will only print the Denominator to the one value that matches on the exitMonthCategory value. There are other values of exitMonthCategory in s6Freq72Numerator that will be blank. I tried that already actually.

Paul

ballardw
Super User

Since both of your input data sets are sorted by cnty_name startyear and exitMonthCategory the output of the merged data set is sorted on the same variables. If you need the output sorted by exitMonthCategory you need another sort after the merge.

You probably have missing values for your numerator variable because the combination of the sort variables didn't exist in the numerator data set but did exist in the denominator set. Or you have multiples.

Alternative approach might be

(This does not require sorting the data first)

proc sql;

     create table s6Final72 as

     select a.*, b.s6Freq72Numerator

     froms 6Freq72Denominator as a left joint s6Freq72Numerator as b on

          a.cnty_name=b.cnty_name and a.startyear=b.startyear and a.exitMonthCategory=b.exitMonthCategory

    order by a.exitMonthCategory;

quit;

Paul_NYS
Obsidian | Level 7

Hi Ballard

I didn't think of using proc sql, but will give it a shot.

Paul

kij23
Calcite | Level 5

Hi Paul,

Below code will give you the desired output, exactly matching to your Expected results -


data s6Final72;

merge s6Freq72Denominator s6Freq72Numerator(in=a) ;

by cnty_name startyear;
if a;
run;

Regards

Paul_NYS
Obsidian | Level 7

Hi Kij

I actually tried that and it comes close, however for some reason the '6' value of exitMonthCategory in s6Freq72Numerator gets renamed to the exitMonthCategory in s6Freq72Denominator and the CumulativeNumber is set to the Denominator in s6Freq72Denominator for the given county/year combo. If I can't find any proper way to do this, then I was going to do what you have below and rename all the '6' values back to their correct records.

Paul

art297
Opal | Level 21

Paul,

The following worked for me:

data s6Final72;

  merge  s6Freq72Denominator (drop=exitMonthCategory) s6Freq72Numerator (in=a);

  by cnty_name startyear;

  if a;

run;

DanWALDO
Calcite | Level 5

The important feature of this and kij23's reply is that the denominator file precedes the numerator file. When that's the case, the drop= option should not be necessary, I think ... but it's always cleaner.

Paul_NYS
Obsidian | Level 7

Hi Art

The second one worked perfectly. Thanks a lot. I guess I originally missed the drop statement then.

Paul

art297
Opal | Level 21

Paul,

As Dan pointed out, the drop statement really wasn't even necessary.

The problem you had was (1) using the denominator file as the 2nd rather than the first file and (2) not using an in= option to exclude irrelevant records from the denominator file.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 1376 views
  • 0 likes
  • 5 in conversation