- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
I'm attempting to troubleshoot some code and having no success. Ideally I would have this pull search terms from an excel table to be then used in a PRXMATCH statement which searches row by row for these terms, flagging them if they appear. This works well other than the fact that any modifiers or metacharacters entered in the excel document (which are then inserted into the PRXMATCH statement) are ignored for some reason.
Essentially I'd like to have a program that runs through datasets searching for words/phrases that are found in another document, but adding this intermediary seems to cause odd behavior.
The data I'm dealing with is product descriptions and so it would be useful to change entries in an excel file (say, searching for men's clothes would mean the column is populated with terms like "man" "men's" "boys") and have a variable created and instances where these terms appear be flagged. This instance in particular worked, but if I'm looking for terms such as "fire-proof" I would normally use regex along the lines of "fire\sproof". This does not work, however and these instances aren't flagged when they should be.
%MACRO TEMP;
%DO i = 1 %TO &MAX_VARNUM.;
data rep1; set out.clean_list (keep = &&V_&i. where= (&&V_&i. ne "")); run;
proc sql;
select &&V_&i.
INTO :&&V_&i. SEPARATED BY '|'
from rep1
;quit;
proc sql; create table macros&i. as select * from dictionary.macros; quit;
%END;
DATA cln (KEEP = n_id all_txt ); set &out_lib..complete; n_id = _n_; run;
%DO i = 1 %TO &MAX_VARNUM.;
%LET VN_&i. = %SYSFUNC(TRANSLATE(&&V_&i., "__", " -"));
DATA cln_&i. ; SET cln;
IF PRXMATCH("m/(&&V_&i.)/oi", all_txt) then
DO;
name_&&VN_&i. = "&&V_&i.";
&&VN_&i. = 1;
END;
RUN;
%END;
DATA fnl.clnw_&inpdet._&date.;
MERGE out.complete cln_:;
BY n_id;
RUN;
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Probably not your exact need, but since you did not supply examples of you data structure, this will do.
No macros needed, this matches the values you describe:
data CLEAN_LIST;
VARA='mens ' ; VARB='fire-proof'; output;
VARA='boys ' ; VARB='man '; output;
run;
data HAVE;
ALL_TXT='fireproof vest'; output;
ALL_TXT="boy's vest"; output;
run;
data MATCH;
set HAVE;
if 0 then set CLEAN_LIST nobs=NOBS;
do VAR='VARA', 'VARB';
do OBSNO=1 to NOBS;
set CLEAN_LIST point=OBSNO;
VAL1 =vvaluex(VAR);
VAL2 =prxchange('s/[^A-Z]/[^A-Z]?/i', -1, trim(VAL1));
FLAG=prxmatch(catt('m/',VAL2,'/i'), ALL_TXT);
if FLAG then output;
end;
end;
keep ALL_TXT VAL1;
run;
ALL_TXT | VAL1 |
---|---|
fireproof vest | fire-proof |
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I don't understand why you are taking the values out of the dataset into macro variables.
You can generate the string with the regular expression using normal code.
Please provide example input data set(s) and expected output data.
Also explain (in words) the algorithm you think will be able to produce that output.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Probably not your exact need, but since you did not supply examples of you data structure, this will do.
No macros needed, this matches the values you describe:
data CLEAN_LIST;
VARA='mens ' ; VARB='fire-proof'; output;
VARA='boys ' ; VARB='man '; output;
run;
data HAVE;
ALL_TXT='fireproof vest'; output;
ALL_TXT="boy's vest"; output;
run;
data MATCH;
set HAVE;
if 0 then set CLEAN_LIST nobs=NOBS;
do VAR='VARA', 'VARB';
do OBSNO=1 to NOBS;
set CLEAN_LIST point=OBSNO;
VAL1 =vvaluex(VAR);
VAL2 =prxchange('s/[^A-Z]/[^A-Z]?/i', -1, trim(VAL1));
FLAG=prxmatch(catt('m/',VAL2,'/i'), ALL_TXT);
if FLAG then output;
end;
end;
keep ALL_TXT VAL1;
run;
ALL_TXT | VAL1 |
---|---|
fireproof vest | fire-proof |