BookmarkSubscribeRSS Feed
aycarus
Calcite | Level 5
I have been trying to think of a solution to this for awhile, but I have had no luck. I have a list of customers (with Customer ID and Possessions, which is a semi-colon delimited list as a string) and a list of objects (as strings), as follows:

customers:
Customer ID# Possessions
------------------ ----------------
1 'apples; oranges'
2 'pears'
3 'pears; cheese'

objects:
Object
---------
'apples'
'pineapples'
'cheese'

I want to find all customers whose Possessions can also be found in the list of Objects (only customers 1 and 3 in this case, since the first has 'apples' whereas the third has 'cheese'. Customer 2 has nothing which is in the list of objects). What is the best way to do this in SAS?

I have tried using a SQL statement that looks like

proc sql;
create table customer_pt as select *
from customers as a, objects as b
where a.Possessions like ('%' + b.Object + '%');
quit;

but that just produces errors...

Any help would be appreciated...
7 REPLIES 7
LinusH
Tourmaline | Level 20
I am certain that there are many ways to do this. On way I would try is first to split up your Possessions column so you will have each possession in separate columns, then transpose that table, so that you will have just one possession per row. After doing that, just do an inner join to the Objects table.

/Linus
Data never sleeps
aycarus
Calcite | Level 5
One additional confounding problem with this is that I want to also detect partial matches. Namely, if one customer has "cheese slices" for Possessions then I want that to also be flagged. I can't imagine it being possible to do this any other way than per-row string searching of some kind.
Cynthia_sas
SAS Super FREQ
Hi:
You might try the FIND function in your SQL WHERE clause:
[pre]
where find(a.possessions, trim(b.object)) gt 0;
[/pre]

cynthia
NickR
Quartz | Level 8
little more in detail...

proc sql;
create table both as select * from customer as a, object as b where index(possessions, trim(compress(object, "'")));
quit;
Cynthia_sas
SAS Super FREQ
Hi:
If you use the DSD option when you read the data, there are no quotes in the string that need to be compressed out.

cynthia

[pre]
data customers(keep=cust_id possession) ;
length cust_id 8 possession $100;
infile datalines dsd dlm=" ";
input cust_id $ possession $;
output customers;
return;
datalines4;
1 'apples; oranges'
2 'pears'
3 'pears; cheese'
4 'cheese slices'
5 'oranges; pears'
6 'cheeses tray'
;;;;
run;

title;
proc print data=customers;
title 'Original record';
run;

data objects(keep=object);
length object $20;

infile datalines dsd;
input object $;
output objects;
return;
datalines;
'apples'
'pineapples'
'cheese'
;
run;

proc print data=objects;
title 'original list of objects';
run;

proc sql;
create table sql_find as
select *
from customers as a, objects as b
where find(a.possession, trim(b.object)) gt 0;
quit;

proc print data=sql_find;
title 'After FIND in SQL';
run;

[/pre]
aycarus
Calcite | Level 5
Nice! I think it works. Thanks!
NickR
Quartz | Level 8
nice...you are tooo good...

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 659 views
  • 0 likes
  • 4 in conversation