BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
jwestra84
Fluorite | Level 6

Hello,

 

I am having a little trouble merging two datasets together. 

Dataset A:

ID Date x1 x2 x3

 

Dataset B:

ID Date y1

 

Both datasets are one record per date, but A will have 5 dates per ID, whereas B will have anywhere from 0 to 6 dates per ID. Not every ID in A will have records in B.

 

When I first tried this, it was merging correctly for the first occurance per id, but afterwards would just add the records from B as if I were interleaving the datasets. I used the following code:

 

data AB; merge A B;

by ID date;

run;

 

The resulting dataset looks like this:

 

ID Date x1 x2 x3 y1

1   1      1   0   1

1   2      0   0   1   1

1   3      0   1   1   

1   4      1   1   0

1   5      0   1   1

1   5                     2

 

So set A has all five dates and set B has records at dates 2 and 5. I would like it so that the two records with ID 1 and date 5 were the same record.

Any help would be appreciated!

 

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

I suspect that your dates don't really match all the time. Try rounding the dates before joining :

 

proc sql;
create table AB as
select a.*, b.y
from A as a left join B as b
on a.ID=b.ID and round(a.date)=round(b.date);
quit;
PG

View solution in original post

6 REPLIES 6
Reeza
Super User

Many to many merges aren't what you expect in a data step. Using SQL in this case is recommended.

You show what you're getting, but not what you expect ...what do you expect as results?

http://support.sas.com/kb/24/752.html

jwestra84
Fluorite | Level 6

I am expecting to get this:

 

ID Date x1 x2 x3 y1

1   1      1   0   1

1   2      0   0   1   1

1   3      0   1   1   

1   4      1   1   0

1   5      0   1   1   2

 

There should only be one record per date, but the data step results in there being an extra record for the dates that are also in B.

 

I have also tried in SQL, but have gotten similar results.

 

proc sql;
create table AB as
select *
from A as a left join B as b
on a.ID=b.cID and a.date=b.date;
quit;

PGStats
Opal | Level 21

I suspect that your dates don't really match all the time. Try rounding the dates before joining :

 

proc sql;
create table AB as
select a.*, b.y
from A as a left join B as b
on a.ID=b.ID and round(a.date)=round(b.date);
quit;
PG
jwestra84
Fluorite | Level 6

Thank you for calling my attention to this. Turns out this was the problem.

 

Thank you!

AnnaBrown
Community Manager

I'm glad you were able to identify the problem, jwestra84! If one of the replies was the exact solution to your problem, can you "Accept it as a solution"? Or if one was particularly helpful, feel free to "Like" it. This will help other community members who may run into the same issue know what worked.

Thanks!
Anna

Astounding
PROC Star

You will need to check your data.  The results you describe do not match the program you showed.  The program would perform exactly as you hoped and expected.  These results correspond to data set B having 2 observations for DATE=5.  On the first of those, Y1 is missing and on the second, Y1 is 2.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 6 replies
  • 2786 views
  • 0 likes
  • 5 in conversation