<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Flagging certain words in a string using macro variable list in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Flagging-certain-words-in-a-string-using-macro-variable-list/m-p/880409#M347868</link>
    <description>&lt;P&gt;The whole data step is as simple as you wrote it which is why it was not included. Here are some examples of what var1, var2, and var3 might look like.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=""&gt;data test;
var1='I live in Alabama';
var2='My blood pressure is high';
var3='I went to my doctor today';
output;

data want;
set test;

if findw("&amp;amp;statenames",var1,'|','i')&amp;gt;=1 then state_flag=1;
else if findw("&amp;amp;statenames",var2,'|','i')&amp;gt;=1 then state_flag=1;
else if findw("&amp;amp;statenames",var3,'|','i')&amp;gt;=1 then state_flag=1;
else state_flag=0;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;The code should flag this record because it contains 'Alabama' but it does not as currently written.&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Tue, 13 Jun 2023 13:24:17 GMT</pubDate>
    <dc:creator>martyvd</dc:creator>
    <dc:date>2023-06-13T13:24:17Z</dc:date>
    <item>
      <title>Flagging certain words in a string using macro variable list</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Flagging-certain-words-in-a-string-using-macro-variable-list/m-p/880304#M347819</link>
      <description>&lt;P&gt;I am trying to flag certain words in 3 different string variables using a macro variable list. For example, I created a macro variable containing a list of US state names.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=""&gt;proc sql noprint;                                                                                                                       
  select staten
  into :statenames separated by '|'                                                                                                     
  from sashelp.tgrmaps;                                                                                                                            
quit;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;First I tried to use the findw function but it did not flag any records. The code works with a single word as the argument so I suspect it may be searching for the entire list rather than the individual names. I wasn't able to work out how to fix that.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=""&gt;if %sysfunc(findw(var1,&amp;amp;statenames, ,i))&amp;gt;=1 then state_flag=1;
else if %sysfunc(findw(var2,&amp;amp;statenames, ,i))&amp;gt;=1 then state_flag=1;
else if %sysfunc(findw(var3,&amp;amp;statenames, ,i))&amp;gt;=1 then state_flag=1;
else state_flag=0;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I then tried to use pattern matching as shown below. However, my dataset is very large (~7 million records), and this method seems too slow: it does not finish even after 2+ hours.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=""&gt;if prxmatch("/&amp;amp;statenames/i",var1)&amp;gt;0 then state_flag=1;
else if prxmatch("/&amp;amp;statenames/i",var2)&amp;gt;0 then state_flag=1;
else if prxmatch("/&amp;amp;statenames/i",var3)&amp;gt;0 then state_flag=1;
else state_flag=0;&lt;/CODE&gt;&amp;nbsp;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 12 Jun 2023 21:15:55 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Flagging-certain-words-in-a-string-using-macro-variable-list/m-p/880304#M347819</guid>
      <dc:creator>martyvd</dc:creator>
      <dc:date>2023-06-12T21:15:55Z</dc:date>
    </item>
    <item>
      <title>Re: Flagging certain words in a string using macro variable list</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Flagging-certain-words-in-a-string-using-macro-variable-list/m-p/880305#M347820</link>
      <description>&lt;P&gt;To tell you the truth, showing us portions of the data step, and not the whole data step, hinders our ability to answer your questions. In addition, not showing us portions of the data (specifically, typical contents of var1, var2 and var3) also hinders our ability to answer your questions. And I don't seem to have sashelp.tgrmaps in SAS OnDemand for Academics.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Nevertheless, this may work (depending on your data)&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql noprint;
  select distinct statecode
  into :statecodes separated by '|'
  from sashelp.zipcode;
quit;
%put &amp;amp;=statecodes;

data have;
    var1='CT';
    var2='BU';
    output;
    var1='BU';
    var2='AG';
    output;
run;
data want;
    set have;
    if findw("&amp;amp;statecodes",var1,'|','i')&amp;gt;=1 then state_flag=1;
	else if findw("&amp;amp;statecodes",var2,'|','i')&amp;gt;=1 then state_flag=1;
	else state_flag=0;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If it doesn't work, then provide the information requested above.&lt;/P&gt;</description>
      <pubDate>Mon, 12 Jun 2023 21:48:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Flagging-certain-words-in-a-string-using-macro-variable-list/m-p/880305#M347820</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2023-06-12T21:48:53Z</dc:date>
    </item>
    <item>
      <title>Re: Flagging certain words in a string using macro variable list</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Flagging-certain-words-in-a-string-using-macro-variable-list/m-p/880306#M347821</link>
      <description>Why are you using %SYSFUNC in an IF/ELSE without macro logic? I still don't think FINDW will work but that code is confusing.</description>
      <pubDate>Mon, 12 Jun 2023 21:34:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Flagging-certain-words-in-a-string-using-macro-variable-list/m-p/880306#M347821</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2023-06-12T21:34:44Z</dc:date>
    </item>
    <item>
      <title>Re: Flagging certain words in a string using macro variable list</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Flagging-certain-words-in-a-string-using-macro-variable-list/m-p/880311#M347825</link>
      <description>&lt;P&gt;If a temporary array approach is an option try this approach instead. Because temporary arrays are loaded into memory this may be faster.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;A href="https://gist.github.com/statgeek/2f733d27820f43fa37d6ba92c30f22cf" target="_blank"&gt;https://gist.github.com/statgeek/2f733d27820f43fa37d6ba92c30f22cf&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 12 Jun 2023 22:22:20 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Flagging-certain-words-in-a-string-using-macro-variable-list/m-p/880311#M347825</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2023-06-12T22:22:20Z</dc:date>
    </item>
    <item>
      <title>Re: Flagging certain words in a string using macro variable list</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Flagging-certain-words-in-a-string-using-macro-variable-list/m-p/880323#M347829</link>
      <description>&lt;P&gt;SASHELP.TGRMAPS seems to be installed only with SAS/GIS, which is sort of deprecated for most users.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;First thing, have you LOOKED at your Statenames macro variable? You may have many repeats as the description I find for that data set involves FIPS codes and states. So it is quite possible that a state appears many times in the variable.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You might try&lt;/P&gt;
&lt;PRE&gt;proc sql noprint;                                                                                                                       
  select Distinct staten
  into :statenames separated by '|'                                                                                                     
  from sashelp.tgrmaps;                                                                                                                            
quit;&lt;/PRE&gt;
&lt;P&gt;to reduce the size of the macro variable and see if that speeds up the solution you have that seems to "work" with Prxmatch.&lt;/P&gt;
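&lt;P&gt;A minimal sketch for checking what the macro variable actually holds, assuming &amp;amp;statenames has already been created by a PROC SQL step like the one above and uses '|' as its delimiter. The NOTE text is only illustrative.&lt;/P&gt;
&lt;PRE&gt;/* Write the full value to the log */
%put &amp;amp;=statenames;
/* Character length and number of '|'-delimited names */
%put NOTE: length=%length(%superq(statenames)) names=%sysfunc(countw(%superq(statenames),|));&lt;/PRE&gt;
&lt;P&gt;If the count is far above the roughly 50 names you expect, the list probably contains repeats.&lt;/P&gt;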
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Is this really looking through random text to see if one (or more) state names appears in the text?&lt;/P&gt;
&lt;P&gt;If so, you may need another pass or two through the data, because state names can appear inside compound names where the value is not actually a state. A local example: in Boise, Idaho we have a "New York Canal". Is that intended to be a match for your process? Many other place names (mountains, rivers and streams, roads, restaurants, to start a list) have a state as part of their name.&lt;/P&gt;</description>
      <pubDate>Tue, 13 Jun 2023 00:03:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Flagging-certain-words-in-a-string-using-macro-variable-list/m-p/880323#M347829</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2023-06-13T00:03:00Z</dc:date>
    </item>
    <item>
      <title>Re: Flagging certain words in a string using macro variable list</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Flagging-certain-words-in-a-string-using-macro-variable-list/m-p/880409#M347868</link>
      <description>&lt;P&gt;The whole data step is as simple as you wrote it which is why it was not included. Here are some examples of what var1, var2, and var3 might look like.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=""&gt;data test;
var1='I live in Alabama';
var2='My blood pressure is high';
var3='I went to my doctor today';
output;

data want;
set test;

if findw("&amp;amp;statenames",var1,'|','i')&amp;gt;=1 then state_flag=1;
else if findw("&amp;amp;statenames",var2,'|','i')&amp;gt;=1 then state_flag=1;
else if findw("&amp;amp;statenames",var3,'|','i')&amp;gt;=1 then state_flag=1;
else state_flag=0;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;The code should flag this record because it contains 'Alabama' but it does not as currently written.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 13 Jun 2023 13:24:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Flagging-certain-words-in-a-string-using-macro-variable-list/m-p/880409#M347868</guid>
      <dc:creator>martyvd</dc:creator>
      <dc:date>2023-06-13T13:24:17Z</dc:date>
    </item>
    <item>
      <title>Re: Flagging certain words in a string using macro variable list</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Flagging-certain-words-in-a-string-using-macro-variable-list/m-p/880410#M347869</link>
      <description>&lt;P&gt;I confirmed that the statenames macro variable contains only the names of 50 states + DC with no duplicates.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Yes, I am really trying to look through text to see if one or more state names appear. My objective is to identify records where individuals may have entered personal information such as their location. I am not concerned about where state names may appear in compound words. Flagging those may actually be helpful.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 13 Jun 2023 13:29:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Flagging-certain-words-in-a-string-using-macro-variable-list/m-p/880410#M347869</guid>
      <dc:creator>martyvd</dc:creator>
      <dc:date>2023-06-13T13:29:35Z</dc:date>
    </item>
    <item>
      <title>Re: Flagging certain words in a string using macro variable list</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Flagging-certain-words-in-a-string-using-macro-variable-list/m-p/880420#M347870</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/228746"&gt;@martyvd&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;The whole data step is as simple as you wrote it which is why it was not included. Here are some examples of what var1, var2, and var3 might look like.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=""&gt;data test;
var1='I live in Alabama';
var2='My blood pressure is high';
var3='I went to my doctor today';
output;

data want;
set test;

if findw("&amp;amp;statenames",var1,'|','i')&amp;gt;=1 then state_flag=1;
else if findw("&amp;amp;statenames",var2,'|','i')&amp;gt;=1 then state_flag=1;
else if findw("&amp;amp;statenames",var3,'|','i')&amp;gt;=1 then state_flag=1;
else state_flag=0;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The code should flag this record because it contains 'Alabama' but it does not as currently written.&amp;nbsp;&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;No, it should not find "Alabama": FINDW looks for its second argument as a whole word inside the first, so you are searching the state list for the entire value &lt;CODE class=""&gt;'I live in Alabama'&lt;/CODE&gt;.&lt;/P&gt;
&lt;P&gt;If you want to find Alabama then you have to parse out each word in the value of Var1 and search for that individually.&lt;/P&gt;
&lt;P&gt;Easy enough to demonstrate:&lt;/P&gt;
&lt;PRE&gt;data test;
var1='I live in Alabama';
var2='My blood pressure is high';
var3='I went to my doctor today';
output;

data want;
set test;

if findw("Alabama Alaska",var1,'|','i')&amp;gt;=1 then state_flag=1;
run;&lt;/PRE&gt;
&lt;P&gt;So try something more like:&lt;/P&gt;
&lt;PRE&gt;data test;
var1='I live in Alabama';
var2='My blood pressure is high';
var3='I went to my doctor today';
output;

data want;
set test;
length word $ 25;
do i=1 to countw(var1);
   word=scan(var1,i);
   if findw("Alabama|Alaska",strip(word),'|','i')&amp;gt;=1 then do;
      state_flag=1;
      leave;
   end;
end;
run;&lt;/PRE&gt;
&lt;P&gt;Make sure the temporary variable Word is defined long enough to hold the likely "words" in your phrases. You need Strip in the Findw call because otherwise Word is padded with blanks to its full length and will likely not be found in the search list.&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#000000"&gt;The Leave instruction will terminate the loop over the words in Var1 as soon as the first match is found.&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#000000"&gt;You would want to add two more loops for var2 and var3 OR concatenate Var1, var2 and Var3 to a single phrase to use in the loop to search.&lt;/FONT&gt;&lt;/P&gt;
&lt;PRE&gt;data test;
var1='I went to my doctor today';
var2='My blood pressure is high';
var3='I live in Alabama and am likely to stay there';
output;

data want;
set test;
length word $ 25;
searchstr = catx(' ', var1,var2,var3);
state_flag=0;
do i=1 to countw(searchstr);
   word=scan(searchstr,i);
   if findw("Alabama|Alaska",strip(word),'|','i')&amp;gt;=1 then do;
      state_flag=1;
      leave;
   end;
end;
run;&lt;/PRE&gt;
&lt;P&gt;&lt;FONT color="#000000"&gt;You would drop the temporary variables i and Searchstr, and likely Word as well, though keeping Word tells you which match set the flag. If you use the Searchstr variable, you will likely need to give it a Length too.&lt;/FONT&gt;&lt;/P&gt;
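&lt;P&gt;Putting those notes together, here is a sketch of the concatenated version with an explicit Length for Searchstr and the helper variables dropped. It assumes the test data set from the example above; the length of 400 is an assumption, so size it to hold var1, var2 and var3 plus the two separating blanks.&lt;/P&gt;
&lt;PRE&gt;data want;
set test;
/* Declare lengths before first use so neither variable is truncated */
length searchstr $ 400 word $ 25;
searchstr = catx(' ', var1, var2, var3);
state_flag=0;
do i=1 to countw(searchstr, ' ');
   word=scan(searchstr, i, ' ');
   if findw("Alabama|Alaska",strip(word),'|','i')&amp;gt;=1 then do;
      state_flag=1;
      leave;
   end;
end;
/* Word is kept so you can see which match set the flag */
drop i searchstr;
run;&lt;/PRE&gt;
&lt;P&gt;Because the text is split on blanks, this sketch only matches single-word names from the list.&lt;/P&gt;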
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 13 Jun 2023 14:29:26 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Flagging-certain-words-in-a-string-using-macro-variable-list/m-p/880420#M347870</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2023-06-13T14:29:26Z</dc:date>
    </item>
    <item>
      <title>Re: Flagging certain words in a string using macro variable list</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Flagging-certain-words-in-a-string-using-macro-variable-list/m-p/880434#M347875</link>
      <description>&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data test;
    length var1 var2 var3 $100.;
    var1='I live in Alabama';
    var2='My blood pressure is high';
    var3='I went to my doctor today';
    output;
    var1='I live in Canada';
    var2='My blood pressure is high';
    var3='I went to my doctor today';
    output;
    var1='I live in Canada';
    var2='My blood pressure is high in Montana';
    var3='I went to my doctor today';
    output;
run;

proc sql;
    create table us_states as select statename from sashelp.us_data;
quit;

%let num_search_terms = &amp;amp;sqlobs.;
%put &amp;amp;num_search_terms;

data flagged;
    *declare array;
    array _search(&amp;amp;num_search_terms.) $100. _temporary_;

    /*2*/
    *load array into memory;

    if _n_=1 then
        do j=1 to &amp;amp;num_search_terms.;
            set us_states;
            _search(j)=statename;
        end;
    set test;
    array _var(3) var1-var3;
    *set flag to 0 for initial start;
    state_flag=0;

    /*3*/
    *loop through and create flag;

    do i=1 to &amp;amp;num_search_terms. while(state_flag=0);

        /*4*/
        do k=1 to 3 while(state_flag=0);

            if find(_var(k), _search(i), 'it')&amp;gt;0 then
                state_flag=1;
        end;
    end;
    drop i j k;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 13 Jun 2023 15:11:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Flagging-certain-words-in-a-string-using-macro-variable-list/m-p/880434#M347875</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2023-06-13T15:11:05Z</dc:date>
    </item>
    <item>
      <title>Re: Flagging certain words in a string using macro variable list</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Flagging-certain-words-in-a-string-using-macro-variable-list/m-p/880460#M347885</link>
      <description>Thank you, this seems to be working.</description>
      <pubDate>Tue, 13 Jun 2023 15:57:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Flagging-certain-words-in-a-string-using-macro-variable-list/m-p/880460#M347885</guid>
      <dc:creator>martyvd</dc:creator>
      <dc:date>2023-06-13T15:57:38Z</dc:date>
    </item>
    <item>
      <title>Re: Flagging certain words in a string using macro variable list</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Flagging-certain-words-in-a-string-using-macro-variable-list/m-p/880465#M347887</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/228746"&gt;@martyvd&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;I confirmed that the statenames macro variable contains only the names of 50 states + DC with no duplicates.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Yes, I am really trying to look through text to see if one or more state names appear. My objective is to identify records where individuals may have entered personal information such as their location. I am not concerned about where state names may appear in compound words. Flagging those may actually be helpful.&amp;nbsp;&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;If you want to directly search if ANY of a list of values appears in free text then a regular expression is probably going to be the shortest code.&amp;nbsp; The pipe character is used in a regular expression to list a set of alternative possible matches.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Example:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
  infile cards dsd truncover ;
  input (var1 var2) (:$30.) ;
cards;
I live in Alabama,
,My blood pressure is high
I live in Kansas,My blood pressure is low
;

data want ;
  set have;
  found=0&amp;lt;prxmatch('/\b(alabama|alaska|arkansas)\b/i',catx(' ',of var1-var2));
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Result&lt;/P&gt;
&lt;PRE&gt;OBS          var1                     var2               found

 1     I live in Alabama                                   1
 2                          My blood pressure is high      0
 3     I live in Kansas     My blood pressure is low       0
&lt;/PRE&gt;
&lt;P&gt;You could have the list of states in a macro variable:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;%let states=alabama|alaska|arkansas;
data want ;
  set have;
  found=0&amp;lt;prxmatch("/\b(&amp;amp;states)\b/i",catx(' ',of var1-var2));
run;&lt;/CODE&gt;&lt;/PRE&gt;
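&lt;P&gt;One possible way to fill that macro variable from a lookup table instead of typing it, sketched here with sashelp.us_data and its statename column (the table used elsewhere in this thread). It assumes the names contain no regular-expression metacharacters and that the combined list fits in a macro variable.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql noprint;
  /* SEPARATED BY trims each value and joins them with the regex alternation character */
  select distinct statename
    into :states separated by '|'
    from sashelp.us_data;
quit;
%put &amp;amp;=states;

data want ;
  set have;
  /* The \b anchors keep Kansas from matching inside Arkansas */
  found=0&amp;lt;prxmatch("/\b(&amp;amp;states)\b/i",catx(' ',of var1-var2));
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Because the pattern is a constant string once the macro variable resolves, SAS should only need to compile it once for the whole data step.&lt;/P&gt;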
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 13 Jun 2023 16:11:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Flagging-certain-words-in-a-string-using-macro-variable-list/m-p/880465#M347887</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2023-06-13T16:11:37Z</dc:date>
    </item>
  </channel>
</rss>

