data words;
input dummy WORDS $;
cards;
1 AAA
2 BBB
3 CCC
4 DDD
5 EEE
;
data have;
input (COMPANY PROD1 PROD2 PROD3) ($);
cards;
C1 FAAAC FQQQ FPPP FZZZ
C2 FTTTW FUUU FPPP FLLL
C3 FOOOC FBBB FPPP FMMM
C4 FIIIIC FOOO FPPP FLLL
;
options noquotelenmax;
proc sql noprint;
select words into :words separated by '|' from words;
quit;
data want;
set have;
if prxmatch("/&words/",catx('|',of prod:));
run;
proc print;run;
data words;
input dummy WORDS $;
cards;
1 AAA
2 BBB
3 CCC
4 DDD
5 EEE
;
data have;
input (COMPANY PROD1 PROD2 PROD3) ($);
cards;
C1 FAAAC FQQQ FPPP FZZZ
C2 FTTTW FUUU FPPP FLLL
C3 FOOOC FBBB FPPP FMMM
C4 FIIIIC FOOO FPPP FLLL
;
options noquotelenmax;
proc sql noprint;
select words into :words separated by '|' from words;
quit;
data want;
set have;
if prxmatch("/&words/",catx('|',of prod:));
run;
proc print;run;
Your company-dataset seems to consist of four variables, but the subsequent lines have five values. So please post the data you have in usable form.
Similar to @Ksharp 's but no macro variable.
data WORDS;
input ITEM WORD $;
cards;
1 AAA
2 BBB
3 CCC
4 DDD
5 EEE
run;
data COMPANIES;
input COMPANY $ PROD1 $ PROD2 $ PROD3 $;
cards;
C1 FAAAC FQQQ FPPP FZZZ
C2 FTTTW FUUU FPPP FLLL
C3 FOOOC FBBB FPPP FMMM
C4 FIIIIC FOOO FPPP FLLL
run;
data WANT;
length WORDS $800;
retain WORDS PRX;
if _N_=1 then do;
do I= 1 by 1 until(LASTOBS);
set WORDS end=LASTOBS;
WORDS=catx('|',WORDS,WORD);
end;
PRX=prxparse(catt('/',WORDS,'/'));
end;
set COMPANIES;
if prxmatch(PRX,catx('|',of PROD1-PROD3));
drop I WORD: ;
run;
Hi @Kendrew yet another similar one alike others using FIND function
data words;
input dummy WORDS $;
cards;
1 AAA
2 BBB
3 CCC
4 DDD
5 EEE
;
data have;
input (COMPANY PROD1 PROD2 PROD3) ($);
cards;
C1 FAAAC FQQQ FPPP FZZZ
C2 FTTTW FUUU FPPP FLLL
C3 FOOOC FBBB FPPP FMMM
C4 FIIIIC FOOO FPPP FLLL
;
data want;
if _n_=1 then do;
if 0 then set words;
dcl hash H (dataset: "words") ;
h.definekey ("words") ;
h.definedone () ;
dcl hiter hi('h');
end;
set have;
length temp $500;
temp=catx(' ',of prod1-prod3);
do while(hi.next()=0);
if find(temp,strip(words)) then _n_=0;
end;
if _n_=0;
keep company prod1-prod3;
run;
COMPANY | PROD1 | PROD2 | PROD3 |
---|---|---|---|
C1 | FAAAC | FQQQ | FPPP |
C3 | FOOOC | FBBB | FPPP |
And yet another way for doing this creating a RegEx informat from your Words table.
/* source data */
data words;
input dummy WORDS $;
cards;
1 AAA
2 BBB
3 CCC
4 DDD
5 EEE
;
data have;
input (COMPANY PROD1 PROD2 PROD3) ($);
cards;
C1 FAAAC FQQQ FPPP FZZZ
C2 FTTTW FUUU FPPP FLLL
C3 FOOOC FBBB FPPP FMMM
C4 FIIIIC FOOO FPPP FLLL
;
/* create informat using regex */
data strings_fmt;
retain fmtname 'match_strings';
retain default 10;
retain type 'I';
retain HLO 'SP';
length start $20;
retain label 1;
set words(keep=words) end=last;
start=cats('/',words,'/i');
output;
if last then
do;
hlo='SO';
label=0;
output;
end;
run;
proc format cntlin=strings_fmt;
run;
/* select rows with matching strings */
data want;
set have;
array _prod prod1-prod3;
do over _prod;
/* string found if informat returns 1 */
if input(_prod,match_strings.) = 1 then
do;
output;
leave;
end;
end;
run;
proc print;
run;
data words;
input dummy WORDS $;
cards;
1 AAA
2 BBB
3 CCC
4 DDD
5 EEE
;
data have;
input (COMPANY PROD1 PROD2 PROD3) ($);
cards;
C1 FAAAC FQQQ FPPP FZZZ
C2 FTTTW FUUU FPPP FLLL
C3 FOOOC FBBB FPPP FMMM
C4 FIIIIC FOOO FPPP FLLL
;
proc sql;
create table want as
select a.*
from have a, words
where index(catx(prod1,prod2,prod3),strip(words))>0;
quit;
COMPANY | PROD1 | PROD2 | PROD3 |
---|---|---|---|
C1 | FAAAC | FQQQ | FPPP |
C3 | FOOOC | FBBB | FPPP |
Thank you sir @Patrick though catx without delimiter in a SQL is kinda tolerant to work like cats 🙂
But like you pointed better to be safe than sorry, and with a tweak to handle multiple matches(if present)
proc sql;
create table want as
select distinct a.*
from have a, words
group by COMPANY,PROD1, PROD2, PROD3
having max(index(catx('|',prod1,prod2,prod3),strip(words)));
quit;
@novinosrin wrote:
Thank you sir @Patrick though catx without delimiter in a SQL is kinda tolerant to work like cats 🙂
You sure about this? Then can you explain below result?
data words;
input dummy WORDS $;
cards;
1 AAA
2 BBB
3 CCC
4 DDD
5 EEE
;
data have;
input (COMPANY PROD1 PROD2 PROD3) ($);
cards;
C1 FAAAC FBBB FPPP FZZZ
C2 DTTTA AADD FPPP FLLL
C3 FOOOC FBBB FPPP FMMM
C4 FIIIIC FOOO FPPP FLLL
;
proc sql;
/* create table want as*/
select
a.*,
catx(prod1,prod2,prod3) as str1,
catx('|',prod1,prod2,prod3) as str2
from have a, words
group by COMPANY,PROD1, PROD2, PROD3
having max(index(catx(prod1,prod2,prod3),strip(words)));
quit;
Right I see. Sorry for my misconception/misunderstanding. I take back my words. My apologies.
I sit corrected. Thank you!
It was a nice one. Made me realize that I can use a variable as delimiter. Can't think of a use case right now - but who knows.
data test;
set sashelp.class;
string=catx(sex,age,height,weight);
run;
proc print data=test;
run;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.