BookmarkSubscribeRSS Feed
q5339
Calcite | Level 5

I want to merge a couple columns from dataset B onto dataset A, but only if A.column contains the substring B.column1. If that condition is met, I want to add B.column2 to dataset A.

 

I feel like I should be able to do this in proc sql but I'm not sure how to implement the conditional substring search.

5 REPLIES 5
PeterClemmensen
Tourmaline | Level 20

Welcome to the SAS communities 🙂

 

Sounds like a job that can be done in several ways. Can you be a bit more specific in your requirements? 

 

And can you provide some small example data set and what you want the result to be? Makes it much easier for us to help you 🙂

q5339
Calcite | Level 5

The relevant column in dataset A contains a bunch of serial numbers that are delimited with "|", so something like "12345|23434|098543".

 

For dataset B, column 1 contains single serial numbers (e.g. "12345") and column 2 contains dates.

 

If A.column contains the serial number in B.column1, I want to add B.column2 to A.

 

Conceptually, it's something like: if find(A.column, B.column1) ~= 0 then A left join B.column2

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Post test data in the form of a datastep, see:
https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...

 

Tip 1, never keep multiple data items in one variable, its not a good way to store data and will make your life harder.

 

Tip 2, Read the documentation, to find one string in another you could use index() SAS function,

http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000212242.htm

So long as its not pass through SQL that will work fine.

q5339
Calcite | Level 5

Regarding your first tip, I would normally agree. Unfortunately, the nature of the data I'm using requires me to put multiple items in 1 variable.

 

For tip 2, if I can't use index or find or contains within SQL, how could I do this conditional join? I've played around using DATA and MERGE but it doesn't seem to produce the desired result.

 

If you need example data, I suppose this could work

data set1;
input serials:$20.;
datalines;
12345|124|5498
4398|29893|3409|324
3498|0989
;;;

data set2;
input serial:$5. stuff;
datalines;
12345 metadata1
98 metadata2
0989 metadata3
;;;;

The desired output would look like:

serials        stuff

12345|124|5498 metadata1

4398|29893|3409|324

3498|0989 metadata3 

PeterClemmensen
Tourmaline | Level 20

What if set2 also contains row 2 like below?

 

data set2;
input serial:$5. stuff;
datalines;
12345 metadata1
124 metadata2
98 metadata2
0989 metadata3
;;;;

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 2040 views
  • 0 likes
  • 3 in conversation