I'm trying to write an iterative script in sas that would see assign an id to a customer based on whether they've met conditions in my lookup table. Normally, I would accomplish this with a sql join, but I need a procedural script that would see whether they've met the conditions that exist in the lookup, then assign them that id. Some attributes are not required or available, so a sql join would not work, since the join would require that all conditions be met. See the example, below:
customers table
customer attr1 attr2 attr3 jerry a r g tom q e h cindy c f j
id_lookup table
id attr1 attr2 attr3 1 a (null) g 2 (null) e h 3 c f (null)
final output
customer id jerry 1 tom 2 cindy 3
Note that jerry had a match on attr1 and attr3, so met the conditions, thus was assigned 1. The script moves on to the next customer to assign an id procedurally, starting at 1 and moving on in ascending order.
in sql, I would write
select a.customer , b.id from customers a join id_lookup b on ( a.attr1 = b.attr1 and a.attr2 = b.attr2 and a.attr3 = b.attr2 )
However, some attributes are not available and since a customer might match many ids, may not be assigned the right once, since the customer should be assigned the id with the lowest value that they qualify for.
Is there a script in sas that can accomplish this?
Perl Regular Expression.
data customer;
input customer $ attr1 $ attr2 $ attr3 $;
pid=catx(',',attr1,attr2,attr3);
cards;
jerry a r g
tom q e h
cindy c f j
;
run;
data lookup;
input id attr1 $ attr2 $ attr3 $;
pid='/^'||tranwrd(catx(',',attr1,attr2,attr3),'(null)','.*')||'$/';
cards;
1 a (null) g
2 (null) e h
3 c f (null)
;
run;
proc sql;
select b.*,a.customer,a.pid as a_pid
from customer as a right join lookup as b
on prxmatch(b.pid,strip(a.pid));
quit;
You can put that SQL (assuming it works) into a proc sql statement - SAS has ANSI SQL embedded in it:
proc sql; create table want as select a.customer , b.id from customers a join id_lookup b on ( a.attr1 = b.attr1 and a.attr2 = b.attr2 and a.attr3 = b.attr2 ); quit;
Now I think your problem is bigger than that. You say they can have many lookup values, what about overlap, i.e. Jerry and Tom both have E attribute? Generally speaking I would, and this would also help the SQL, normalise both datasets, so something like:
CUSTOMER ATTR_NO RESULT
001 1 a
001 2 r
...
You can then eradicate rows with nulls. Check for duplicates. Then take the first result which happens (i.e. the lowest), then merge the two together.
awesome. thanks! This totally worked!
Perl Regular Expression.
data customer;
input customer $ attr1 $ attr2 $ attr3 $;
pid=catx(',',attr1,attr2,attr3);
cards;
jerry a r g
tom q e h
cindy c f j
;
run;
data lookup;
input id attr1 $ attr2 $ attr3 $;
pid='/^'||tranwrd(catx(',',attr1,attr2,attr3),'(null)','.*')||'$/';
cards;
1 a (null) g
2 (null) e h
3 c f (null)
;
run;
proc sql;
select b.*,a.customer,a.pid as a_pid
from customer as a right join lookup as b
on prxmatch(b.pid,strip(a.pid));
quit;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.