data words;
input dummy WORDS $;
cards;
1 AAA
2 BBB
3 CCC
4 DDD
5 EEE
;
data have;
input (COMPANY PROD1 PROD2 PROD3) ($);
cards;
C1 FAAAC FQQQ FPPP FZZZ
C2 FTTTW FUUU FPPP FLLL
C3 FOOOC FBBB FPPP FMMM
C4 FIIIIC FOOO FPPP FLLL
;
options noquotelenmax;
proc sql noprint;
select words into :words separated by '|' from words;
quit;
data want;
set have;
if prxmatch("/&words/",catx('|',of prod:));
run;
proc print;run;
data words;
input dummy WORDS $;
cards;
1 AAA
2 BBB
3 CCC
4 DDD
5 EEE
;
data have;
input (COMPANY PROD1 PROD2 PROD3) ($);
cards;
C1 FAAAC FQQQ FPPP FZZZ
C2 FTTTW FUUU FPPP FLLL
C3 FOOOC FBBB FPPP FMMM
C4 FIIIIC FOOO FPPP FLLL
;
options noquotelenmax;
proc sql noprint;
select words into :words separated by '|' from words;
quit;
data want;
set have;
if prxmatch("/&words/",catx('|',of prod:));
run;
proc print;run;
Your company-dataset seems to consist of four variables, but the subsequent lines have five values. So please post the data you have in usable form.
Similar to @Ksharp 's but no macro variable.
data WORDS;
input ITEM WORD $;
cards;
1 AAA
2 BBB
3 CCC
4 DDD
5 EEE
run;
data COMPANIES;
input COMPANY $ PROD1 $ PROD2 $ PROD3 $;
cards;
C1 FAAAC FQQQ FPPP FZZZ
C2 FTTTW FUUU FPPP FLLL
C3 FOOOC FBBB FPPP FMMM
C4 FIIIIC FOOO FPPP FLLL
run;
data WANT;
length WORDS $800;
retain WORDS PRX;
if _N_=1 then do;
do I= 1 by 1 until(LASTOBS);
set WORDS end=LASTOBS;
WORDS=catx('|',WORDS,WORD);
end;
PRX=prxparse(catt('/',WORDS,'/'));
end;
set COMPANIES;
if prxmatch(PRX,catx('|',of PROD1-PROD3));
drop I WORD: ;
run;
Hi @Kendrew yet another similar one alike others using FIND function
data words;
input dummy WORDS $;
cards;
1 AAA
2 BBB
3 CCC
4 DDD
5 EEE
;
data have;
input (COMPANY PROD1 PROD2 PROD3) ($);
cards;
C1 FAAAC FQQQ FPPP FZZZ
C2 FTTTW FUUU FPPP FLLL
C3 FOOOC FBBB FPPP FMMM
C4 FIIIIC FOOO FPPP FLLL
;
data want;
if _n_=1 then do;
if 0 then set words;
dcl hash H (dataset: "words") ;
h.definekey ("words") ;
h.definedone () ;
dcl hiter hi('h');
end;
set have;
length temp $500;
temp=catx(' ',of prod1-prod3);
do while(hi.next()=0);
if find(temp,strip(words)) then _n_=0;
end;
if _n_=0;
keep company prod1-prod3;
run;
| COMPANY | PROD1 | PROD2 | PROD3 |
|---|---|---|---|
| C1 | FAAAC | FQQQ | FPPP |
| C3 | FOOOC | FBBB | FPPP |
And yet another way for doing this creating a RegEx informat from your Words table.
/* source data */
data words;
input dummy WORDS $;
cards;
1 AAA
2 BBB
3 CCC
4 DDD
5 EEE
;
data have;
input (COMPANY PROD1 PROD2 PROD3) ($);
cards;
C1 FAAAC FQQQ FPPP FZZZ
C2 FTTTW FUUU FPPP FLLL
C3 FOOOC FBBB FPPP FMMM
C4 FIIIIC FOOO FPPP FLLL
;
/* create informat using regex */
data strings_fmt;
retain fmtname 'match_strings';
retain default 10;
retain type 'I';
retain HLO 'SP';
length start $20;
retain label 1;
set words(keep=words) end=last;
start=cats('/',words,'/i');
output;
if last then
do;
hlo='SO';
label=0;
output;
end;
run;
proc format cntlin=strings_fmt;
run;
/* select rows with matching strings */
data want;
set have;
array _prod prod1-prod3;
do over _prod;
/* string found if informat returns 1 */
if input(_prod,match_strings.) = 1 then
do;
output;
leave;
end;
end;
run;
proc print;
run;
data words;
input dummy WORDS $;
cards;
1 AAA
2 BBB
3 CCC
4 DDD
5 EEE
;
data have;
input (COMPANY PROD1 PROD2 PROD3) ($);
cards;
C1 FAAAC FQQQ FPPP FZZZ
C2 FTTTW FUUU FPPP FLLL
C3 FOOOC FBBB FPPP FMMM
C4 FIIIIC FOOO FPPP FLLL
;
proc sql;
create table want as
select a.*
from have a, words
where index(catx(prod1,prod2,prod3),strip(words))>0;
quit;
| COMPANY | PROD1 | PROD2 | PROD3 |
|---|---|---|---|
| C1 | FAAAC | FQQQ | FPPP |
| C3 | FOOOC | FBBB | FPPP |
Thank you sir @Patrick though catx without delimiter in a SQL is kinda tolerant to work like cats 🙂
But like you pointed better to be safe than sorry, and with a tweak to handle multiple matches(if present)
proc sql;
create table want as
select distinct a.*
from have a, words
group by COMPANY,PROD1, PROD2, PROD3
having max(index(catx('|',prod1,prod2,prod3),strip(words)));
quit;
@novinosrin wrote:
Thank you sir @Patrick though catx without delimiter in a SQL is kinda tolerant to work like cats 🙂
You sure about this? Then can you explain below result?
data words;
input dummy WORDS $;
cards;
1 AAA
2 BBB
3 CCC
4 DDD
5 EEE
;
data have;
input (COMPANY PROD1 PROD2 PROD3) ($);
cards;
C1 FAAAC FBBB FPPP FZZZ
C2 DTTTA AADD FPPP FLLL
C3 FOOOC FBBB FPPP FMMM
C4 FIIIIC FOOO FPPP FLLL
;
proc sql;
/* create table want as*/
select
a.*,
catx(prod1,prod2,prod3) as str1,
catx('|',prod1,prod2,prod3) as str2
from have a, words
group by COMPANY,PROD1, PROD2, PROD3
having max(index(catx(prod1,prod2,prod3),strip(words)));
quit;
Right I see. Sorry for my misconception/misunderstanding. I take back my words. My apologies.
I sit corrected. Thank you!
It was a nice one. Made me realize that I can use a variable as delimiter. Can't think of a use case right now - but who knows.
data test;
set sashelp.class;
string=catx(sex,age,height,weight);
run;
proc print data=test;
run;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.