Solved: Re: How to Match Observations from Two Variables

yawenyu · Posted 05-08-2017 01:57 PM

Hello!

I am using SAS ver9.3 and I am attempting to match observations from two different variables: Subject1 and Subject2. Subject2 doesn't have as many observations as Subject1 due to necessary duplicates and that is causing the issue. When these two variables are in the same dataset using a simple merge, it ends up looking like this:

Subject1 Subject2

A-0001-01 0001-01

A-0001-01 0002-02

A-0001-01 0003-03

B-0002-02

C-0003-03

Instead, this is what I'm interested in:

Subject1 Subject2

A-0001-01 0001-01

A-0001-01

B-0002-02 0002-02

B-0002-02

C-0003-03 0003-03

C-0003-03

I am new to SAS programming, so I apologize if there might be a simple fix that I'm not realizing. Thank you!

CliftonDenning · Posted 05-08-2017 09:55 PM

Well I was assuming you wanted to combine them based on a key, but if they are the same size and you just want them to be in the same dataset in the same order:

data combined;

merge subject1 subject2;

run;

Should do it. However if you want to join them with the pseudo key I mentioned above, maybe with additional values, the values would all be populated, unless you create a unique key that you can then join on.

Alternatlvely if you are pressed for time you can use a pretty inefficient “first”

Data final;

Set joined;

By subject1;

If not first.subject1 then subject2 = “”;

run;

That would set them all to null.

To set a key you could just run this on subject1:

Data subject1;

Set subject1;

By subject1;

If first.subject1 then key = substr(subject1, 3,7);

run;

Then just join on the new key value.

View solution in original post

CliftonDenning · Posted 05-08-2017 02:17 PM

Using a proc sql join you could pull the two together fairly simply. Essentially you need a standard key between the two to join on. Proc sql lets you cheat a bit using an on statement. Otherwise you'd need to create something two two can join on. One additional benefit is that joins don't require a sort compared to a merge.

proc sql; create table joined as select a.*, b.* from subject1 a left join subject2 b
on substr(a.a, 3,7) = b.b;

Which would get you:

Subject1	Subject2
A-0001-01	0001-01
A-0001-01	0001-01
A-0001-01	0001-01
B-0002-02	0002-02
B-0002-02	0002-02
C-0003-03	0003-03
C-0003-03	0003-03
C-0003-03	0003-03
C-0003-03	0003-03

yawenyu · Posted 05-08-2017 05:38 PM

Hi Clifton, thank you for your reply. I went ahead and tried the code, but is there a way to keep the blank obervations? Like this:

Subject1	Subject2
A-0001-01	0001-01
A-0001-01
A-0001-01
B-0002-02	0002-02
B-0002-02
C-0003-03	0003-03
C-0003-03
C-0003-03
C-0003-03

CliftonDenning · Posted 05-08-2017 09:55 PM

Well I was assuming you wanted to combine them based on a key, but if they are the same size and you just want them to be in the same dataset in the same order:

data combined;

merge subject1 subject2;

run;

Should do it. However if you want to join them with the pseudo key I mentioned above, maybe with additional values, the values would all be populated, unless you create a unique key that you can then join on.

Alternatlvely if you are pressed for time you can use a pretty inefficient “first”

Data final;

Set joined;

By subject1;

If not first.subject1 then subject2 = “”;

run;

That would set them all to null.

To set a key you could just run this on subject1:

Data subject1;

Set subject1;

By subject1;

If first.subject1 then key = substr(subject1, 3,7);

run;

Then just join on the new key value.

How to Match Observations from Two Variables

Re: How to Match Observations from Two Variables

Re: How to Match Observations from Two Variables

Re: How to Match Observations from Two Variables

Re: How to Match Observations from Two Variables

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away