Help using Base SAS procedures

merging based on additional criteria

Reply
Occasional Contributor
Posts: 8

merging based on additional criteria

I'm trying to merge two data sets based on id and admission date. If you see below, I merged medicare_final with medicare_proc_first for beneficiary ids that are in both data sets. However, I also only want the observations from medicare_final to merge with medicare_proc_first where admission_date in medicare_final is greater than the admission date in medicare_proc_first and only within 365 days after the admission_date in medicare_proc_first. Can I do this??
Any help is greatly appreciated; I have no idea how to accomplish this.

data medicare_proc_all;
merge medicare_proc_first (IN=M1) medicare_final (IN = M2);
by beneficiary_id;
if m1 and m2;
run;
Trusted Advisor
Posts: 2,113

Re: merging based on additional criteria

It is straightforward using SQL.

PROC SQL;
CREATE TABLE medicare_proc_all AS
SELECT *
FROM medicare_proc_first AS p,
medicare_final AS f
WHERE
p.beneficiary_id=f.beneficiary_id AND
f.admission_date BETWEEN p.admission_date+1 and P.admission_date+365
;

You may want to rename one of the admission dates so you keep both in the final dataset.
Super Contributor
Super Contributor
Posts: 365

Re: merging based on additional criteria

Hello SweetPeaIndeed,

It is better to use proc SQL in this case like this:
[pre]
proc SQL;
create table medicare_proc_all as
select a.*, b.* /* list fields you need from both tables */
from medicare_proc_first as a, medicare_final as b
where a.beneficiary_id=b.beneficiary_id and
b.admission_date-365 LE a.admission_date LT b.admission_date
;quit;
[/pre]
Since both datasets contain the same variables like beneficiary_id, admission_date and may be some others it is necessary to list all necessary variables in the select clause.

Sincerely,
SPR
Occasional Contributor
Posts: 8

Re: merging based on additional criteria

Thank you! However, it doesn't seem to be giving me what I am looking for. When I use either of the sql statements it only gives me the first observation for each beneficiary_id, which is not what I am trying to get. This makes me think I am explaining myself poorly. The merge in a one-to-many where beneficiary_ids match. So, in medicare_proc, I have one observation for a beneficiary_id but that beneficiary_id has many matches in medicare_final. I want those additional observations in medicare_final but ONLY if the beneficiary_ids match and the admission_date in medicare_final is greater than within 365 days of the admission_date in medicare_proc.

For example:

medicare_proc medicare_final medicare_all

id admission_date id admission_date id admission_date
1 03MAR2003 1 03FEB2003 1 03MAR2003
1 03MAR2003 1 03MAY2003
1 03MAY2003

Merging medicare_proc and medicare_final would give me medicare_all.
Hopefully that makes more sense. Clearly neither of my sas or sql skills are great so I appreciate the help!
Super Contributor
Super Contributor
Posts: 365

Re: merging based on additional criteria

I am completely confused with the information you submit. In your last post you say:
"I want those additional observations in medicare_final but ONLY if the beneficiary_ids match and the admission_date in medicare_final is greater than within 365 days of the admission_date in medicare_proc"

1) It means that admission_date(final) > admission_date(proc) + 365.

In you example you wrote:
[pre]
medicare_proc medicare_final medicare_all
id admission_date id admission_date id admission_date
1 03MAR2003 1 03FEB2003 1 03MAR2003
[/pre]
According to 1) it should be
admission date(final)=03Feb2003 > admission_date(proc)=03Mar2003 + 365=03Mar2003

which is an obvious error. Please, explain.
SPR
Occasional Contributor
Posts: 8

Re: merging based on additional criteria

Thank you both! However, it doesn't seem to be giving me what I am looking for. When I use either of the sql statements it only gives me the first observation for each beneficiary_id, which is not what I am trying to get. This makes me think I am explaining myself poorly. The merge in a one-to-many where beneficiary_ids match. So, in medicare_proc, I have one observation for a beneficiary_id but that beneficiary_id has many matches in medicare_final. I want those additional observations in medicare_final but ONLY if the beneficiary_ids match and the admission_date in medicare_final is greater than within 365 days of the admission_date in medicare_proc.

For example:

medicare_proc medicare_final medicare_all

id admission_date id admission_date id admission_date
1 03MAR2003 1 03FEB2003 1 03MAR2003
1 03MAR2003 1 03MAY2003
1 03MAY2003

Merging medicare_proc and medicare_final would give me medicare_all
Super User
Posts: 9,681

Re: merging based on additional criteria

Or Maybe you can add another condition to filter some observations.


[pre]

data medicare_proc_all;
merge medicare_proc_first (IN=M1) medicare_final (IN = M2 rename=(admission_date=_admission_date));
by beneficiary_id;
if m1 and m2 and _admission_date le admission_date+365;
run;
[/pre]

Ksharp
Occasional Contributor
Posts: 8

Re: merging based on additional criteria

If figured this one out yesterday (surprisingly) using this:

PROC SQL;
CREATE TABLE alldata2 AS
SELECT a.*, b.*
FROM medicare_proc_first a
LEFT JOIN
medicare_final_1 b
ON a.beneficiary_id=b.beneficiary_id_1
where b.admission_date_1 BETWEEN a.admission_date and a.admission_date+365;
QUIT;
RUN;

This worked perfectly for me once I made sure the 2 data sets had different variable names. However, today I am trying to do a left join with medicare_proc_first with a new modified medicare_final_1. There are 13098 unique beneficiary_ids in medicare_proc and from what I understand, with a left join I should see 13098 unique beneficiary ids in my new data set alldata2. However, after creating the new table I only see 12748 unique beneficiary ids. Any idea why this would be happening?
Super User
Posts: 9,681

Re: merging based on additional criteria

Yes.
I think that is because you add a "where" clause.
So it will filter some obs.
You can test it without this "where" clause, to see whether has 13098 unique .

But the best way to find answer is to post some dummy data and code you used,then can check it further.

Ksharp
Regular Contributor
Posts: 184

Re: merging based on additional criteria

Eliminate the WHERE clause, but move the condition into the ON clause (connected with AND). That's probably what's desired.

Outer unions are guided by their ON clauses. Any accompanying WHERE clause operates as a simple filter and does not generate any "outer" rows.

With inner unions, it doesn't matter if some of the conditions are coded in a WHERE clause. With outer unions it does.

Here's an example.

[pre]
data table_L ;
input id_L $ date_L ;
cards ;
x 1
y 2
z 3
;

data table_R ;
input id_R $ date_R amount ;
cards ;
x 1 1.1
y 2 2.2
;

PROC SQL ;
TITLE 'All Conditions in ON Clause' ;
SELECT *
FROM table_L LEFT JOIN table_R
ON id_L EQ id_R
AND date_L EQ date_R
;
TITLE 'Some Conditions in WHERE Clause' ;
SELECT *
FROM table_L LEFT JOIN table_R
ON id_L EQ id_R
WHERE date_L EQ date_R
;
TITLE ;
QUIT ;
[/pre]

The first query returns

[pre]
id_L date_L id_R date_R amount
------------------------------------------------
x 1 x 1 1.1
y 2 y 2 2.2
z 3 . .
[/pre]

but the second returns

[pre]
id_L date_L id_R date_R amount
------------------------------------------------
x 1 x 1 1.1
y 2 y 2 2.2
[/pre]

> Yes.
> I think that is because you add a "where" clause.
> So it will filter some obs.
> You can test it without this "where" clause, to see
> whether has 13098 unique .
>
> But the best way to find answer is to post some dummy
> data and code you used,then can check it further.
>
> Ksharp
Super User
Posts: 9,681

Re: merging based on additional criteria

Hi.Howles
Thanks. let me know there is different between

ON id_L EQ id_R
AND date_L EQ date_R

and

ON id_L EQ id_R
WHERE date_L EQ date_R


I think they are the same before.


Ksharp
Super User
Posts: 9,681

Re: merging based on additional criteria

And "union" you mentioned would be join (I think), union in sql is an other different operator from join .


Ksharp
Ask a Question
Discussion stats
  • 11 replies
  • 1017 views
  • 0 likes
  • 5 in conversation