Hi,
I'm attempting to troubleshoot some code and having no success. Ideally I would have this pull search terms from an excel table to be then used in a PRXMATCH statement which searches row by row for these terms, flagging them if they appear. This works well other than the fact that any modifiers or metacharacters entered in the excel document (which are then inserted into the PRXMATCH statement) are ignored for some reason.
Essentially I'd like to have a program that runs through datasets searching for words/phrases that are found in another document, but adding this intermediary seems to cause odd behavior.
The data I'm dealing with is product descriptions and so it would be useful to change entries in an excel file (say, searching for men's clothes would mean the column is populated with terms like "man" "men's" "boys") and have a variable created and instances where these terms appear be flagged. This instance in particular worked, but if I'm looking for terms such as "fire-proof" I would normally use regex along the lines of "fire\sproof". This does not work, however and these instances aren't flagged when they should be.
%MACRO TEMP;
%DO i = 1 %TO &MAX_VARNUM.;
data rep1; set out.clean_list (keep = &&V_&i. where= (&&V_&i. ne "")); run;
proc sql;
select &&V_&i.
INTO :&&V_&i. SEPARATED BY '|'
from rep1
;quit;
proc sql; create table macros&i. as select * from dictionary.macros; quit;
%END;
DATA cln (KEEP = n_id all_txt ); set &out_lib..complete; n_id = _n_; run;
%DO i = 1 %TO &MAX_VARNUM.;
%LET VN_&i. = %SYSFUNC(TRANSLATE(&&V_&i., "__", " -"));
DATA cln_&i. ; SET cln;
IF PRXMATCH("m/(&&V_&i.)/oi", all_txt) then
DO;
name_&&VN_&i. = "&&V_&i.";
&&VN_&i. = 1;
END;
RUN;
%END;
DATA fnl.clnw_&inpdet._&date.;
MERGE out.complete cln_:;
BY n_id;
RUN;
Probably not your exact need, but since you did not supply examples of you data structure, this will do.
No macros needed, this matches the values you describe:
data CLEAN_LIST;
VARA='mens ' ; VARB='fire-proof'; output;
VARA='boys ' ; VARB='man '; output;
run;
data HAVE;
ALL_TXT='fireproof vest'; output;
ALL_TXT="boy's vest"; output;
run;
data MATCH;
set HAVE;
if 0 then set CLEAN_LIST nobs=NOBS;
do VAR='VARA', 'VARB';
do OBSNO=1 to NOBS;
set CLEAN_LIST point=OBSNO;
VAL1 =vvaluex(VAR);
VAL2 =prxchange('s/[^A-Z]/[^A-Z]?/i', -1, trim(VAL1));
FLAG=prxmatch(catt('m/',VAL2,'/i'), ALL_TXT);
if FLAG then output;
end;
end;
keep ALL_TXT VAL1;
run;
ALL_TXT | VAL1 |
---|---|
fireproof vest | fire-proof |
I don't understand why you are taking the values out of the dataset into macro variables.
You can generate the string with the regular expression using normal code.
Please provide example input data set(s) and expected output data.
Also explain (in words) the algorithm you think will be able to produce that output.
Probably not your exact need, but since you did not supply examples of you data structure, this will do.
No macros needed, this matches the values you describe:
data CLEAN_LIST;
VARA='mens ' ; VARB='fire-proof'; output;
VARA='boys ' ; VARB='man '; output;
run;
data HAVE;
ALL_TXT='fireproof vest'; output;
ALL_TXT="boy's vest"; output;
run;
data MATCH;
set HAVE;
if 0 then set CLEAN_LIST nobs=NOBS;
do VAR='VARA', 'VARB';
do OBSNO=1 to NOBS;
set CLEAN_LIST point=OBSNO;
VAL1 =vvaluex(VAR);
VAL2 =prxchange('s/[^A-Z]/[^A-Z]?/i', -1, trim(VAL1));
FLAG=prxmatch(catt('m/',VAL2,'/i'), ALL_TXT);
if FLAG then output;
end;
end;
keep ALL_TXT VAL1;
run;
ALL_TXT | VAL1 |
---|---|
fireproof vest | fire-proof |
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.