BookmarkSubscribeRSS Feed
sweetpeaindeed
Calcite | Level 5
I'm trying to merge two data sets based on id and admission date. If you see below, I merged medicare_final with medicare_proc_first for beneficiary ids that are in both data sets. However, I also only want the observations from medicare_final to merge with medicare_proc_first where admission_date in medicare_final is greater than the admission date in medicare_proc_first and only within 365 days after the admission_date in medicare_proc_first. Can I do this??
Any help is greatly appreciated; I have no idea how to accomplish this.

data medicare_proc_all;
merge medicare_proc_first (IN=M1) medicare_final (IN = M2);
by beneficiary_id;
if m1 and m2;
run;
11 REPLIES 11
Doc_Duke
Rhodochrosite | Level 12
It is straightforward using SQL.

PROC SQL;
CREATE TABLE medicare_proc_all AS
SELECT *
FROM medicare_proc_first AS p,
medicare_final AS f
WHERE
p.beneficiary_id=f.beneficiary_id AND
f.admission_date BETWEEN p.admission_date+1 and P.admission_date+365
;

You may want to rename one of the admission dates so you keep both in the final dataset.
SPR
Quartz | Level 8 SPR
Quartz | Level 8
Hello SweetPeaIndeed,

It is better to use proc SQL in this case like this:
[pre]
proc SQL;
create table medicare_proc_all as
select a.*, b.* /* list fields you need from both tables */
from medicare_proc_first as a, medicare_final as b
where a.beneficiary_id=b.beneficiary_id and
b.admission_date-365 LE a.admission_date LT b.admission_date
;quit;
[/pre]
Since both datasets contain the same variables like beneficiary_id, admission_date and may be some others it is necessary to list all necessary variables in the select clause.

Sincerely,
SPR
sweetpeaindeed
Calcite | Level 5
Thank you! However, it doesn't seem to be giving me what I am looking for. When I use either of the sql statements it only gives me the first observation for each beneficiary_id, which is not what I am trying to get. This makes me think I am explaining myself poorly. The merge in a one-to-many where beneficiary_ids match. So, in medicare_proc, I have one observation for a beneficiary_id but that beneficiary_id has many matches in medicare_final. I want those additional observations in medicare_final but ONLY if the beneficiary_ids match and the admission_date in medicare_final is greater than within 365 days of the admission_date in medicare_proc.

For example:

medicare_proc medicare_final medicare_all

id admission_date id admission_date id admission_date
1 03MAR2003 1 03FEB2003 1 03MAR2003
1 03MAR2003 1 03MAY2003
1 03MAY2003

Merging medicare_proc and medicare_final would give me medicare_all.
Hopefully that makes more sense. Clearly neither of my sas or sql skills are great so I appreciate the help!
SPR
Quartz | Level 8 SPR
Quartz | Level 8
I am completely confused with the information you submit. In your last post you say:
"I want those additional observations in medicare_final but ONLY if the beneficiary_ids match and the admission_date in medicare_final is greater than within 365 days of the admission_date in medicare_proc"

1) It means that admission_date(final) > admission_date(proc) + 365.

In you example you wrote:
[pre]
medicare_proc medicare_final medicare_all
id admission_date id admission_date id admission_date
1 03MAR2003 1 03FEB2003 1 03MAR2003
[/pre]
According to 1) it should be
admission date(final)=03Feb2003 > admission_date(proc)=03Mar2003 + 365=03Mar2003

which is an obvious error. Please, explain.
SPR
sweetpeaindeed
Calcite | Level 5
Thank you both! However, it doesn't seem to be giving me what I am looking for. When I use either of the sql statements it only gives me the first observation for each beneficiary_id, which is not what I am trying to get. This makes me think I am explaining myself poorly. The merge in a one-to-many where beneficiary_ids match. So, in medicare_proc, I have one observation for a beneficiary_id but that beneficiary_id has many matches in medicare_final. I want those additional observations in medicare_final but ONLY if the beneficiary_ids match and the admission_date in medicare_final is greater than within 365 days of the admission_date in medicare_proc.

For example:

medicare_proc medicare_final medicare_all

id admission_date id admission_date id admission_date
1 03MAR2003 1 03FEB2003 1 03MAR2003
1 03MAR2003 1 03MAY2003
1 03MAY2003

Merging medicare_proc and medicare_final would give me medicare_all
Ksharp
Super User
Or Maybe you can add another condition to filter some observations.


[pre]

data medicare_proc_all;
merge medicare_proc_first (IN=M1) medicare_final (IN = M2 rename=(admission_date=_admission_date));
by beneficiary_id;
if m1 and m2 and _admission_date le admission_date+365;
run;
[/pre]

Ksharp
sweetpeaindeed
Calcite | Level 5
If figured this one out yesterday (surprisingly) using this:

PROC SQL;
CREATE TABLE alldata2 AS
SELECT a.*, b.*
FROM medicare_proc_first a
LEFT JOIN
medicare_final_1 b
ON a.beneficiary_id=b.beneficiary_id_1
where b.admission_date_1 BETWEEN a.admission_date and a.admission_date+365;
QUIT;
RUN;

This worked perfectly for me once I made sure the 2 data sets had different variable names. However, today I am trying to do a left join with medicare_proc_first with a new modified medicare_final_1. There are 13098 unique beneficiary_ids in medicare_proc and from what I understand, with a left join I should see 13098 unique beneficiary ids in my new data set alldata2. However, after creating the new table I only see 12748 unique beneficiary ids. Any idea why this would be happening?
Ksharp
Super User
Yes.
I think that is because you add a "where" clause.
So it will filter some obs.
You can test it without this "where" clause, to see whether has 13098 unique .

But the best way to find answer is to post some dummy data and code you used,then can check it further.

Ksharp
Howles
Quartz | Level 8
Eliminate the WHERE clause, but move the condition into the ON clause (connected with AND). That's probably what's desired.

Outer unions are guided by their ON clauses. Any accompanying WHERE clause operates as a simple filter and does not generate any "outer" rows.

With inner unions, it doesn't matter if some of the conditions are coded in a WHERE clause. With outer unions it does.

Here's an example.

[pre]
data table_L ;
input id_L $ date_L ;
cards ;
x 1
y 2
z 3
;

data table_R ;
input id_R $ date_R amount ;
cards ;
x 1 1.1
y 2 2.2
;

PROC SQL ;
TITLE 'All Conditions in ON Clause' ;
SELECT *
FROM table_L LEFT JOIN table_R
ON id_L EQ id_R
AND date_L EQ date_R
;
TITLE 'Some Conditions in WHERE Clause' ;
SELECT *
FROM table_L LEFT JOIN table_R
ON id_L EQ id_R
WHERE date_L EQ date_R
;
TITLE ;
QUIT ;
[/pre]

The first query returns

[pre]
id_L date_L id_R date_R amount
------------------------------------------------
x 1 x 1 1.1
y 2 y 2 2.2
z 3 . .
[/pre]

but the second returns

[pre]
id_L date_L id_R date_R amount
------------------------------------------------
x 1 x 1 1.1
y 2 y 2 2.2
[/pre]

> Yes.
> I think that is because you add a "where" clause.
> So it will filter some obs.
> You can test it without this "where" clause, to see
> whether has 13098 unique .
>
> But the best way to find answer is to post some dummy
> data and code you used,then can check it further.
>
> Ksharp
Ksharp
Super User
Hi.Howles
Thanks. let me know there is different between

ON id_L EQ id_R
AND date_L EQ date_R

and

ON id_L EQ id_R
WHERE date_L EQ date_R


I think they are the same before.


Ksharp
Ksharp
Super User
And "union" you mentioned would be join (I think), union in sql is an other different operator from join .


Ksharp

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 9449 views
  • 0 likes
  • 5 in conversation