BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
yawenyu
Obsidian | Level 7

Hello!

 

I am using SAS ver9.3 and I am attempting to match observations from two different variables: Subject1 and Subject2. Subject2 doesn't have as many observations as Subject1 due to necessary duplicates and that is causing the issue. When these two variables are in the same dataset using a simple merge, it ends up looking like this:

 

Subject1               Subject2

A-0001-01              0001-01

A-0001-01              0002-02

A-0001-01              0003-03

B-0002-02

B-0002-02

C-0003-03

C-0003-03

C-0003-03

C-0003-03

 

 

Instead, this is what I'm interested in:

 

Subject1               Subject2

A-0001-01              0001-01

A-0001-01             

A-0001-01             

B-0002-02              0002-02

B-0002-02

C-0003-03              0003-03

C-0003-03

C-0003-03

C-0003-03

 

I am new to SAS programming, so I apologize if there might be a simple fix that I'm not realizing. Thank you!

1 ACCEPTED SOLUTION

Accepted Solutions
CliftonDenning
Obsidian | Level 7

Well I was assuming you wanted to combine them based on a key, but if they are the same size and you just want them to be in the same dataset in the same order:

 

data combined;

merge subject1 subject2;

run;

 

Should do it. However if you want to join them with the pseudo key I mentioned above, maybe with additional values, the values would all be populated, unless you create a unique key that you can then join on.

 

Alternatlvely if you are pressed for time you can use a pretty inefficient “first”

 

Data final;

            Set joined;

            By subject1;

            If not first.subject1 then subject2 = “”;

run;

 

That would set them all to null.

To set a key you could just run this on subject1:

 

Data subject1;

            Set subject1;

            By subject1;

            If first.subject1 then key = substr(subject1, 3,7);

run;

 

Then just join on the new key value.

View solution in original post

3 REPLIES 3
CliftonDenning
Obsidian | Level 7

Using a proc sql join you could pull the two together fairly simply. Essentially you need a standard key between the two to join on. Proc sql lets you cheat a bit using an on statement. Otherwise you'd need to create something two two can join on. One additional benefit is that joins don't require a sort compared to a merge. 

 

proc sql; create table joined as select a.*, b.* from subject1 a left join subject2 b
on substr(a.a, 3,7) = b.b;

 

Which would get you:

 

Subject1Subject2
A-0001-010001-01
A-0001-01             0001-01
A-0001-01             0001-01
B-0002-020002-02
B-0002-020002-02
C-0003-030003-03
C-0003-030003-03
C-0003-030003-03
C-0003-030003-03
yawenyu
Obsidian | Level 7

Hi Clifton, thank you for your reply. I went ahead and tried the code, but is there a way to keep the blank obervations? Like this:

 

Subject1Subject2
A-0001-010001-01
A-0001-01              
A-0001-01              
B-0002-020002-02
B-0002-02 
C-0003-030003-03
C-0003-03 
C-0003-03 
C-0003-03 
CliftonDenning
Obsidian | Level 7

Well I was assuming you wanted to combine them based on a key, but if they are the same size and you just want them to be in the same dataset in the same order:

 

data combined;

merge subject1 subject2;

run;

 

Should do it. However if you want to join them with the pseudo key I mentioned above, maybe with additional values, the values would all be populated, unless you create a unique key that you can then join on.

 

Alternatlvely if you are pressed for time you can use a pretty inefficient “first”

 

Data final;

            Set joined;

            By subject1;

            If not first.subject1 then subject2 = “”;

run;

 

That would set them all to null.

To set a key you could just run this on subject1:

 

Data subject1;

            Set subject1;

            By subject1;

            If first.subject1 then key = substr(subject1, 3,7);

run;

 

Then just join on the new key value.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 668 views
  • 2 likes
  • 2 in conversation