Help using Base SAS procedures

Advanced string comparison in a row

Reply
New Contributor
Posts: 3

Advanced string comparison in a row

I have been trying to think of a solution to this for awhile, but I have had no luck. I have a list of customers (with Customer ID and Possessions, which is a semi-colon delimited list as a string) and a list of objects (as strings), as follows:

customers:
Customer ID# Possessions
------------------ ----------------
1 'apples; oranges'
2 'pears'
3 'pears; cheese'

objects:
Object
---------
'apples'
'pineapples'
'cheese'

I want to find all customers whose Possessions can also be found in the list of Objects (only customers 1 and 3 in this case, since the first has 'apples' whereas the third has 'cheese'. Customer 2 has nothing which is in the list of objects). What is the best way to do this in SAS?

I have tried using a SQL statement that looks like

proc sql;
create table customer_pt as select *
from customers as a, objects as b
where a.Possessions like ('%' + b.Object + '%');
quit;

but that just produces errors...

Any help would be appreciated...
Super User
Posts: 5,257

Re: Advanced string comparison in a row

I am certain that there are many ways to do this. On way I would try is first to split up your Possessions column so you will have each possession in separate columns, then transpose that table, so that you will have just one possession per row. After doing that, just do an inner join to the Objects table.

/Linus
Data never sleeps
New Contributor
Posts: 3

Re: Advanced string comparison in a row

One additional confounding problem with this is that I want to also detect partial matches. Namely, if one customer has "cheese slices" for Possessions then I want that to also be flagged. I can't imagine it being possible to do this any other way than per-row string searching of some kind.
SAS Super FREQ
Posts: 8,743

Re: Advanced string comparison in a row

Hi:
You might try the FIND function in your SQL WHERE clause:
[pre]
where find(a.possessions, trim(b.object)) gt 0;
[/pre]

cynthia
Frequent Contributor
Posts: 81

Re: Advanced string comparison in a row

little more in detail...

proc sql;
create table both as select * from customer as a, object as b where index(possessions, trim(compress(object, "'")));
quit;
SAS Super FREQ
Posts: 8,743

Re: Advanced string comparison in a row

Hi:
If you use the DSD option when you read the data, there are no quotes in the string that need to be compressed out.

cynthia

[pre]
data customers(keep=cust_id possession) ;
length cust_id 8 possession $100;
infile datalines dsd dlm=" ";
input cust_id $ possession $;
output customers;
return;
datalines4;
1 'apples; oranges'
2 'pears'
3 'pears; cheese'
4 'cheese slices'
5 'oranges; pears'
6 'cheeses tray'
;;;;
run;

title;
proc print data=customers;
title 'Original record';
run;

data objects(keep=object);
length object $20;

infile datalines dsd;
input object $;
output objects;
return;
datalines;
'apples'
'pineapples'
'cheese'
;
run;

proc print data=objects;
title 'original list of objects';
run;

proc sql;
create table sql_find as
select *
from customers as a, objects as b
where find(a.possession, trim(b.object)) gt 0;
quit;

proc print data=sql_find;
title 'After FIND in SQL';
run;

[/pre]
New Contributor
Posts: 3

Re: Advanced string comparison in a row

Nice! I think it works. Thanks!
Frequent Contributor
Posts: 81

Re: Advanced string comparison in a row

nice...you are tooo good...
Ask a Question
Discussion stats
  • 7 replies
  • 101 views
  • 0 likes
  • 4 in conversation