<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Efficient way to pattern match strings in SAS in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Efficient-way-to-pattern-match-strings-in-SAS/m-p/268768#M53234</link>
    <description>&lt;P&gt;OK, so I found out why it didn't work... For some reason SAS seems to add trailing/leading blanks to the variables when executing the index function. So even though I have already applied strip() and lowcase() to the variables in their own datasets, I have to do it again inside the index call: index(strip(lowcase(FVAR)),strip(lowcase(VAR))). I don't know whether this is somehow connected to the length set for the variables?&lt;/P&gt;</description>
    <pubDate>Fri, 06 May 2016 11:19:49 GMT</pubDate>
    <dc:creator>BobHope</dc:creator>
    <dc:date>2016-05-06T11:19:49Z</dc:date>
    <item>
      <title>Efficient way to pattern match strings in SAS</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Efficient-way-to-pattern-match-strings-in-SAS/m-p/268758#M53229</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I was wondering if there is any way to do efficient data filtering with pattern matching functions in SAS. What I mean by this is that I would give the function a variable (VAR) as an input parameter and the function would parse the data to see if any of the VAR values are contained in the destination variable string.&lt;/P&gt;
&lt;P&gt;To clarify this with an example:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I have a dataset (MATCH) which I maintain, and it contains the desired patterns.&lt;/P&gt;
&lt;P&gt;For example:&lt;/P&gt;
&lt;P&gt;VAR&lt;/P&gt;
&lt;P&gt;hi&lt;/P&gt;
&lt;P&gt;hello&lt;/P&gt;
&lt;P&gt;bye&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;And then in the other dataset (HAVE) I have a variable that is used as the filter criterion, let's say FVAR:&lt;/P&gt;
&lt;P&gt;FVAR&lt;/P&gt;
&lt;P&gt;hi all!&lt;/P&gt;
&lt;P&gt;hello!&lt;/P&gt;
&lt;P&gt;good night&lt;/P&gt;
&lt;P&gt;good morning&lt;/P&gt;
&lt;P&gt;bye bye&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;And my WANT dataset would be:&lt;/P&gt;
&lt;P&gt;FVAR&lt;/P&gt;
&lt;P&gt;hi all!&lt;/P&gt;
&lt;P&gt;hello!&lt;/P&gt;
&lt;P&gt;bye bye&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;And just to point out: my MATCH dataset has hundreds of rows, so doing this with macro variables or hard coding is not an option.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 06 May 2016 09:16:07 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Efficient-way-to-pattern-match-strings-in-SAS/m-p/268758#M53229</guid>
      <dc:creator>BobHope</dc:creator>
      <dc:date>2016-05-06T09:16:07Z</dc:date>
    </item>
    <item>
      <title>Re: Efficient way to pattern match strings in SAS</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Efficient-way-to-pattern-match-strings-in-SAS/m-p/268759#M53230</link>
      <description>&lt;P&gt;There are many string comparison functions in SAS; you will find them listed here:&lt;/P&gt;
&lt;P&gt;&lt;A href="http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000245860.htm" target="_blank"&gt;http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000245860.htm&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You could also use Perl regular expressions.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The simplest way to do what you are asking is like this:&lt;/P&gt;
&lt;PRE&gt;data match;
  length var $200;
  input var $;
datalines;
hi
hello
bye
;
run;
data have;
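  length fvar $200;
  infile datalines truncover;  /* do not flow to the next line on short records */
  input FVAR $200.;            /* read the whole line, embedded blanks included */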
datalines;
hi all!
hello!
good night
good morning
bye bye
;
run;

proc sql;
  create table WANT as
  select  A.*
  from    HAVE A
  left join MATCH B
  on      index(A.FVAR, trim(B.VAR)) &amp;gt; 0  /* trim: trailing blanks in B.VAR would otherwise be searched for too */
  where   B.VAR is not null;
quit;&lt;/PRE&gt;
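&lt;P&gt;As a hedged variant of the join above (a sketch, not part of the tested code): FIND accepts an 'i' modifier to ignore case, and trim() stops the trailing blanks stored in B.VAR from being treated as part of the search string. The inner join and DISTINCT are assumptions added here so that a row matching several patterns is kept only once; note that DISTINCT would also collapse genuinely duplicated HAVE rows.&lt;/P&gt;
&lt;PRE&gt;proc sql;
  create table WANT as
  select distinct A.*
  from    HAVE A
  inner join MATCH B
  /* 'i' = ignore case; trim() drops the padding blanks from the pattern */
  on      find(A.FVAR, trim(B.VAR), 'i') &amp;gt; 0;
quit;&lt;/PRE&gt;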
&lt;P&gt;Note that posting test data in the form of a data step (as shown above) makes it a lot easier to get tested code back to you.&lt;/P&gt;</description>
      <pubDate>Fri, 06 May 2016 09:27:43 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Efficient-way-to-pattern-match-strings-in-SAS/m-p/268759#M53230</guid>
      <dc:creator>RW9</dc:creator>
      <dc:date>2016-05-06T09:27:43Z</dc:date>
    </item>
    <item>
      <title>Re: Efficient way to pattern match strings in SAS</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Efficient-way-to-pattern-match-strings-in-SAS/m-p/268760#M53231</link>
      <description>&lt;P&gt;Yes thank you very much. I will post the whole code next time &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;I didn't even know you could use an SQL join like that!&lt;/P&gt;</description>
      <pubDate>Fri, 06 May 2016 09:32:51 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Efficient-way-to-pattern-match-strings-in-SAS/m-p/268760#M53231</guid>
      <dc:creator>BobHope</dc:creator>
      <dc:date>2016-05-06T09:32:51Z</dc:date>
    </item>
    <item>
      <title>Re: Efficient way to pattern match strings in SAS</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Efficient-way-to-pattern-match-strings-in-SAS/m-p/268761#M53232</link>
      <description>&lt;P&gt;One other option, which is more useful when the matching/processing is more complicated, is to use the match dataset to generate a data step that filters the have dataset, e.g.:&lt;/P&gt;
&lt;PRE&gt;data _null_;
  set match end=last;
  if _n_=1 then cal' execute('data want;  set have;');
  call execute(cat(' if index(fvar,',strip(var),')&amp;gt;0 then output;'));
  if last then call execute(' run;');
run;&lt;/PRE&gt;
&lt;P&gt;This generates one if statement for each row of data in match, and the generated data step then runs.&lt;/P&gt;</description>
      <pubDate>Fri, 06 May 2016 09:36:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Efficient-way-to-pattern-match-strings-in-SAS/m-p/268761#M53232</guid>
      <dc:creator>RW9</dc:creator>
      <dc:date>2016-05-06T09:36:30Z</dc:date>
    </item>
    <item>
      <title>Re: Efficient way to pattern match strings in SAS</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Efficient-way-to-pattern-match-strings-in-SAS/m-p/268766#M53233</link>
      <description>&lt;P&gt;I tried your approach in practice (the first one; I couldn't get the second one working even after correcting the call typo).&lt;/P&gt;
&lt;P&gt;I ran into some problems. Here is the sample code:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data MATCH;
   length VAR $30;
   input VAR $;
infile datalines dlm=',';
datalines;
testing.test,
Does work,
Special % char
;
run;

data HAVE;
length FVAR $300;
   input FVAR $;
   infile datalines dlm=',';
   datalines;
testing.test the test code,
Does work or not,
Does not work or what,
Special % character in a string,
This is not included,
The test of testing.test
TESTING.TEST
;
run;

/*
proc sql;
create table WANT as
select A.*
from HAVE A
left join MATCH B
on index(A.FVAR,B.VAR)&amp;gt;0
where B.VAR is not null;
quit;
*/

proc sql;
create table WANT as
select A.*
from HAVE A
left join MATCH B
on index(strip(lowcase(A.FVAR)),strip(lowcase(B.VAR)))&amp;gt;0
where B.VAR is not null;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The result is not as intended. Do you have any idea why?&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;EDIT: OK, there was a mistake in my example; the old version is commented out and the edited one is live. However, this does not solve my problem with the actual data... I don't know how I could find an example that shows it.&lt;/P&gt;</description>
      <pubDate>Fri, 06 May 2016 10:59:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Efficient-way-to-pattern-match-strings-in-SAS/m-p/268766#M53233</guid>
      <dc:creator>BobHope</dc:creator>
      <dc:date>2016-05-06T10:59:13Z</dc:date>
    </item>
    <item>
      <title>Re: Efficient way to pattern match strings in SAS</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Efficient-way-to-pattern-match-strings-in-SAS/m-p/268768#M53234</link>
      <description>&lt;P&gt;OK, so I found out why it didn't work... For some reason SAS seems to add trailing/leading blanks to the variables when executing the index function. So even though I have already applied strip() and lowcase() to the variables in their own datasets, I have to do it again inside the index call: index(strip(lowcase(FVAR)),strip(lowcase(VAR))). I don't know whether this is somehow connected to the length set for the variables?&lt;/P&gt;</description>
      <pubDate>Fri, 06 May 2016 11:19:49 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Efficient-way-to-pattern-match-strings-in-SAS/m-p/268768#M53234</guid>
      <dc:creator>BobHope</dc:creator>
      <dc:date>2016-05-06T11:19:49Z</dc:date>
    </item>
    <item>
      <title>Re: Efficient way to pattern match strings in SAS</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Efficient-way-to-pattern-match-strings-in-SAS/m-p/268776#M53237</link>
      <description>&lt;P&gt;Well, if the length isn't stipulated then it will default to 8 and pad out. Character variables are fixed length, so shorter values are padded with trailing blanks up to that length. I always strip() variables just to be safe.&lt;/P&gt;
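&lt;P&gt;A minimal sketch of the padding effect, using made-up variable lengths: INDEX treats the trailing blanks stored in its second argument as part of the text to find, so the untrimmed search fails.&lt;/P&gt;
&lt;PRE&gt;data _null_;
  length fvar $20 var $10;
  fvar = 'hi all!';
  var  = 'hi';                       /* stored as 'hi' plus 8 trailing blanks */
  padded  = index(fvar, var);        /* looks for 'hi' followed by 8 blanks */
  trimmed = index(fvar, trim(var));  /* looks for 'hi' only */
  put padded= trimmed=;              /* log: padded=0 trimmed=1 */
run;&lt;/PRE&gt;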
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;For the other method, sorry, there were a few typos in there; this should work:&lt;/P&gt;
&lt;PRE&gt;data MATCH;
   length VAR $30;
   input VAR $;
infile datalines dlm=',';
datalines;
testing.test,
Does work,
Special % char
;
run;

data HAVE;
length FVAR $300;
   input FVAR $;
   infile datalines dlm=',';
   datalines;
testing.test the test code,
Does work or not,
Does not work or what,
Special % character in a string,
This is not included,
The test of testing.test
TESTING.TEST
;
run;

data _null_;
  set match end=last;
  if _n_=1 then call execute('data want;  set have;');
  call execute(cat(' if index(fvar,"',strip(var),'")&amp;gt;0 then output;'));
  if last then call execute(' run;');
run;&lt;/PRE&gt;</description>
      <pubDate>Fri, 06 May 2016 12:11:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Efficient-way-to-pattern-match-strings-in-SAS/m-p/268776#M53237</guid>
      <dc:creator>RW9</dc:creator>
      <dc:date>2016-05-06T12:11:05Z</dc:date>
    </item>
    <item>
      <title>Re: Efficient way to pattern match strings in SAS</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Efficient-way-to-pattern-match-strings-in-SAS/m-p/268966#M53281</link>
      <description>&lt;PRE&gt;
You are not doing an exact match, so check the distance functions, such as spedis(), compged(), complev(), ...
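
/* Sketch (illustration only, not from the original post): compare three   */
/* distance functions on one made-up pair of strings.                      */
/* spedis()  = asymmetric spelling distance                                */
/* complev() = Levenshtein edit distance                                   */
/* compged() = generalized edit distance                                   */
/* In all three, smaller values mean the strings are closer.               */
data _null_;
  a = 'testing.test';
  b = 'testing.tset';
  d_spedis  = spedis(a, b);
  d_complev = complev(a, b);
  d_compged = compged(a, b);
  put d_spedis= d_complev= d_compged=;
run;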


data MATCH;
   length VAR $30;
   input VAR $;
infile datalines dlm=',';
datalines;
testing.test,
Does work,
Special % char
;
run;

data HAVE;
length FVAR $300;
   input FVAR $;
   infile datalines dlm=',';
   datalines;
testing.test the test code,
Does work or not,
Does not work or what,
Special % character in a string,
This is not included,
The test of testing.test
TESTING.TEST
;
run;

proc sql;
create table WANT as
select A.*,B.*
from HAVE A, MATCH B
group by B.VAR
having spedis(strip(lowcase(A.FVAR)),strip(lowcase(B.VAR)))=
  min(spedis(strip(lowcase(A.FVAR)),strip(lowcase(B.VAR))));
quit;
&lt;/PRE&gt;</description>
      <pubDate>Sat, 07 May 2016 04:04:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Efficient-way-to-pattern-match-strings-in-SAS/m-p/268966#M53281</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2016-05-07T04:04:53Z</dc:date>
    </item>
  </channel>
</rss>

