SAS Programming

Paul_NYS · Posted 10-23-2012 03:59 PM

Hi

I am trying to merge two SAS datasets with the below data step. However, I would like to keep all the records of the issueJoined data set and not just only those that have a matching record in the issuejoinedevents data set. Is this possible?

Right, I am not getting any records in the resulting data set. However, when I run a proc SQL server and join the same two data sets through the same column, docket_id, I am getting records.

Paul

data s10;

merge work.issueJoined work.issuejoinedevents (keep=edate rename=(edate=issueJoinDate));

by docket_id;

run;

art297 · Posted 10-23-2012 05:56 PM

One way using proc sort would be to use two sorts. i.e.,

proc sort data=work.issuejoinedevents;

by docket_id edate;

run;

which would order the file by edates within docket_id, following which you could keep only the earliest date with another sort:

proc sort data=work.issuejoinedevents nodupkey;

by docket_id;

run;

View solution in original post

Linlin · Posted 10-23-2012 04:12 PM

Hi,

I added the RED parts. Also both datasets have to be sorted by docket_id.

data s10;

merge work.issueJoined(in=a) work.issuejoinedevents (keep=docked_id edate rename=(edate=issueJoinDate));

by docket_id;

if a;

run;

Paul_NYS · Posted 10-23-2012 05:16 PM

H Linlin

I did as you indicated above and revised the code to reflect the below. It seemed to generally work, however, in the original issueJoined, there are 114,744 records and in the resulting s10 data set, there are 115,608 records. I have not completely looked yet, but I am assuming some records from the issuejoinedevents data set were included that do not have a matching record in the issueJoined data set. Is this possible?

Paul

data s10;

merge work.issueJoined(in=a) work.issuejoinedevents (keep=docket_id edate rename=(edate=issueJoinDate));

by docket_id;

if a;

run;

Astounding · Posted 10-23-2012 05:29 PM

Paul,

No, that's not possible using this code. What is happening is that some of the docket_id values in IssueJoinedEvents appear multiple times within that data set. As a result, they appear multiple times in the merged data set as well.

Paul_NYS · Posted 10-23-2012 05:51 PM

That was the second option, thanks. I am sorting both data sets prior to merging. Both sorts are by docket_id. Can I add a second sort by a date column (edate) and then just keep the first instance of edate (where it is lowest)? In the below code, do you know how would I only keep the first (lowest) value of edate?

proc sort data=work.issuejoinedevents;

by docket_id edate;

run;

art297 · Posted 10-23-2012 05:56 PM

One way using proc sort would be to use two sorts. i.e.,

proc sort data=work.issuejoinedevents;

by docket_id edate;

run;

which would order the file by edates within docket_id, following which you could keep only the earliest date with another sort:

proc sort data=work.issuejoinedevents nodupkey;

by docket_id;

run;

Paul_NYS · Posted 10-23-2012 06:36 PM

That seemed to have worked. I found the nodupkey, but adding the second sort just using the docket_id made it work completely. Thanks to everyone.

Paul

SAS Programming

Merging data sets and keeping one set

Re: Merging data sets and keeping one set

Re: Merging data sets and keeping one set

Re: Merging data sets and keeping one set

Re: Merging data sets and keeping one set

Re: Merging data sets and keeping one set

Re: Merging data sets and keeping one set

Re: Merging data sets and keeping one set

Merging Data sets

data set extraction from 2 files

keeping only content within the 2nd set of parentheses

conditional merge of two data sets

help with sas data set

Follow Us

What is...

SAS Programming

Our biggest data and AI event of the year.

SAS Training: Just a Click Away

Follow Us

What is...