I am working with a very large data set. I want to take a value (such as a lot number) from one column in table A and search for it in a column in table B. If the value from table A is found in table B, return "Yes" to a second column in table A; otherwise return "No". The tricky part is that in table B there can be multiple values in one cell, separated by a space. Here is an example of what I have and what I want.
Table A
Product LotNumber
XYZ 1289
TAD 8943
KQS 5367
IND 2365
HON 9054
PMT 2893
Table B
PR# LotNumber
1234 8379
1235 8372 8943
1236 9054 2893
1237 5367
Table Want
Product LotNumber Match
XYZ 1289 No
TAD 8943 Yes
KQS 5367 Yes
IND 2365 No
HON 9054 Yes
PMT 2893 Yes
My actual data tables are about 10,000 rows long.
Thanks in advance for any help and advice,
Jeff
In SAS terms, 10000 observations is not a large dataset, i.e. it does not require special optimizations. Try this:
proc sql;
create table want as
select A.*,
case when exists(select * from B where lotNumber contains trim(A.lotNumber))
then "yes" else "no" end as match
from A;
select * from want;
quit;
PG
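One caveat with CONTAINS is that it matches any substring, so a lot such as 894 would also match a cell holding 8943. If lot numbers can differ in length, a whole-word test may be safer. The following is only a sketch under assumptions: both lotNumber columns are character, lots in B are separated by single blanks, and A.lotNumber is never missing. It uses the same table names A and B as the query above.

```sas
proc sql;
  create table want as
  select A.*,
         /* findw returns the position of the whole word, or 0 if absent; */
         /* the third argument ' ' names the blank as the word delimiter  */
         case when exists (select * from B
                           where findw(B.lotNumber, strip(A.lotNumber), ' ') > 0)
              then 'Yes' else 'No' end as match
  from A;
quit;
```

Because the test sits inside EXISTS, each row of A appears once in the result no matter how many rows of B mention the lot.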
proc sql;
create table want as
/* distinct keeps one row per lot when several PR#s list the same lot */
select distinct a.*,
       case when strip(b.lotnumber) contains strip(a.lotnumber)
            then 'Yes' else 'No' end as Match
from tablea a left join tableb b
on strip(b.lotnumber) contains strip(a.lotnumber);
quit;
Is there any chance that your lot number will be the same for multiple products?
And is there any relation between Product and PR#? Could it be that PR# is actually a coded value for Product?
Lot Number is a unique number for products, and no two products can share the same lot number. PR# is just an investigation number. There could, however, be multiple PR#s that share the same lot number (2 different investigations on the same lot). I just need to know whether a particular lot has an investigation at all.
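Since one cell in table B can hold several lots, and the same lot can show up under more than one PR#, another option is to split table B into one lot per row first and then do an exact lookup. This is a rough sketch, not a tested solution: it assumes LotNumber is a character column in both tables, and the names B_lots and Lot are made up for the illustration.

```sas
/* Step 1: one output row per lot found in each cell of B */
data B_lots (keep=Lot);
  set B;
  length Lot $ 20;
  /* countw counts the blank-separated words; scan extracts the i-th one */
  do i = 1 to countw(LotNumber, ' ');
    Lot = scan(LotNumber, i, ' ');
    output;
  end;
run;

/* Step 2: exact match; repeated lots in B_lots do not duplicate rows of A */
proc sql;
  create table want as
  select A.*,
         case when A.LotNumber in (select Lot from B_lots)
              then 'Yes' else 'No' end as Match
  from A;
quit;
```

The IN test only asks whether the lot exists at all, which matches the requirement of knowing whether a lot has any investigation.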
A left join would be faster:
data TableA;
  input Product $ LotNumber $;
  cards;
XYZ 1289
TAD 8943
KQS 5367
IND 2365
HON 9054
PMT 2893
;
run;

data TableB;
  input PR LotNumber & $20.;
  cards;
1234 8379
1235 8372 8943
1236 9054 2893
1237 5367
;
run;

proc sql;
  create table want as
  select a.*,
         case when missing(b.LotNumber) then 'No' else 'Yes' end as match
  from TableA as a
  left join TableB as b
    on b.LotNumber contains strip(a.LotNumber);
quit;
Xia Keshan