<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to find strings in a dataset? in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/How-to-find-strings-in-a-dataset/m-p/279002#M56203</link>
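    <description>&lt;P&gt;Upcase converts everything to capitals. This means you can do a single comparison rather than worrying about testing for different cases. Because the text is converted before it is searched, "TREATMENT", "Treatment", "treatment" and "TrEATment" all become "TREATMENT" and all match.&amp;nbsp;&lt;/P&gt;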
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Tue, 21 Jun 2016 12:40:13 GMT</pubDate>
    <dc:creator>Reeza</dc:creator>
    <dc:date>2016-06-21T12:40:13Z</dc:date>
    <item>
      <title>How to find strings in a dataset?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-find-strings-in-a-dataset/m-p/278814#M56121</link>
      <description>&lt;P&gt;Hi Everyone,&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;New to SAS and have no idea what I'm doing!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I need to flag a number of observations in my data set.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;For example, I want to flag all observations that have a description that includes "tx" or "treatment". I want to find any variation of these strings (i.e., whether it's all caps, or some letters are caps and others aren't).&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I also want to EXCLUDE the following strings: "supplies-treatment", "Supplies-treamtent" and "Acute/Subacute Treatment - Discharge".&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;This is the code I have used, but I'm not sure if I'm doing this right at all:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;data flagtx;&lt;BR /&gt;set flagtxdc;&lt;BR /&gt;if prxmatch("m/tx|treatment|treat/oi",description) &amp;gt; 0 then tx=1;&lt;BR /&gt;else tx=0;&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Is this right, and how do I EXCLUDE certain strings?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;(If you answer, I would really appreciate it if you could dumb it down! I'm very new to SAS and don't really understand a lot of the terminology, lol)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;thanks!&lt;/P&gt;</description>
      <pubDate>Mon, 20 Jun 2016 19:55:24 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-find-strings-in-a-dataset/m-p/278814#M56121</guid>
      <dc:creator>christinagting0</dc:creator>
      <dc:date>2016-06-20T19:55:24Z</dc:date>
    </item>
    <item>
      <title>Re: How to find strings in a dataset?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-find-strings-in-a-dataset/m-p/278817#M56122</link>
      <description>&lt;P&gt;What do you mean by exclude: remove those strings, or remove the observations?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Can you post sample data and expected output?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you're a SAS beginner and not familiar with Perl regular expressions, I would recommend the find/findw/index/indexw functions along with upcase/lowcase.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Upcase lets you capitalize all your text so you can simplify comparisons.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Find/index search a string for a specified substring; findw/indexw search it for a whole word.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 20 Jun 2016 19:59:27 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-find-strings-in-a-dataset/m-p/278817#M56122</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2016-06-20T19:59:27Z</dc:date>
    </item>
    <item>
      <title>Re: How to find strings in a dataset?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-find-strings-in-a-dataset/m-p/278823#M56124</link>
      <description>&lt;P&gt;Hi Reeza,&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks for replying.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Here is some example data:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;ID &amp;nbsp; Description&lt;/P&gt;&lt;P&gt;1 &amp;nbsp; &amp;nbsp; chronic treatment&lt;/P&gt;&lt;P&gt;2 &amp;nbsp; &amp;nbsp; Treatment&lt;/P&gt;&lt;P&gt;3 &amp;nbsp; &amp;nbsp; acute/subacute tx&lt;/P&gt;&lt;P&gt;4 &amp;nbsp; &amp;nbsp; supplies-treatment&lt;/P&gt;&lt;P&gt;5 &amp;nbsp; &amp;nbsp; Supplies - Treament&lt;/P&gt;&lt;P&gt;6 &amp;nbsp; &amp;nbsp; apples&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;My expected output is:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;ID &amp;nbsp; Description &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;flagtx&lt;/P&gt;&lt;P&gt;1 &amp;nbsp; &amp;nbsp; chronic treatment &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;1&lt;/P&gt;&lt;P&gt;2 &amp;nbsp; &amp;nbsp; Treatment &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;1&lt;/P&gt;&lt;P&gt;3 &amp;nbsp; &amp;nbsp; acute/subacute tx &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;1&lt;/P&gt;&lt;P&gt;4 &amp;nbsp; &amp;nbsp; supplies-treatment &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 0&lt;/P&gt;&lt;P&gt;5 &amp;nbsp; &amp;nbsp; Supplies - Treament &amp;nbsp; &amp;nbsp; &amp;nbsp;0&lt;/P&gt;&lt;P&gt;6 &amp;nbsp; &amp;nbsp; apples &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;0&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;As you can see from my expected output, IDs 1-3 are flagged in the new variable flagtx because they contain the word treatment or Treatment or tx. IDs 4 &amp;amp; 5 I want to exclude, meaning that they are NOT flagged (they are coded as 0).
ID 6 is given a value of 0 as well because it doesn't contain "treatment" or "tx".&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Does that make sense?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 20 Jun 2016 20:11:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-find-strings-in-a-dataset/m-p/278823#M56124</guid>
      <dc:creator>christinagting0</dc:creator>
      <dc:date>2016-06-20T20:11:17Z</dc:date>
    </item>
    <item>
      <title>Re: How to find strings in a dataset?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-find-strings-in-a-dataset/m-p/278830#M56127</link>
      <description>&lt;P&gt;This works for your sample but may not for your full dataset. You may need to expand your rules...cleaning data is never fun.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you end up with a long list it may be worth setting up an array instead of searching for each term one at a time. But this should get you started.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
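&lt;P&gt;For the long-list case, here is a rough sketch of what the array idea might look like. It is only an illustration under assumptions: the dataset names sample and flagged and the array names flag_terms and skip_terms are made up, the term lists are placeholders, and it uses the 'i' (ignore case) and 't' (trim trailing blanks) modifiers of FIND instead of UPCASE.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Hypothetical sketch: keep the search terms in temporary arrays */
data sample;
input ID   Description $30.;
cards;
1     chronic treatment
2     Treatment
3     acute/subacute tx
4     supplies-treatment
5     Supplies - Treatment
6     apples
;
run;

data flagged;
set sample;

/* Illustrative term lists; extend them as your rules grow */
array flag_terms[2] $ 20 _temporary_ ('TX', 'TREATMENT');
array skip_terms[2] $ 20 _temporary_ ('SUPPLIES', 'DISCHARGE');

flagtx = 0;

/* 'i' ignores case; 't' trims the blank padding of the array elements */
do i = 1 to dim(flag_terms);
	if find(description, flag_terms[i], 'it') &amp;gt; 0 then flagtx = 1;
end;

/* Exclusions are checked last so they override the flag */
do i = 1 to dim(skip_terms);
	if find(description, skip_terms[i], 'it') &amp;gt; 0 then flagtx = 0;
end;

drop i;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;With this layout, adding a term means editing one list rather than writing another IF statement. The simpler one-term-at-a-time version for your sample is below.&lt;/P&gt;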
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
input ID   Description $30.;
cards;
1     chronic treatment
2     Treatment
3     acute/subacute tx
4     supplies-treatment
5     Supplies - Treatment
6     apples
;
run;


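data want;
set have;

/* Uppercase the text so one comparison covers every mix of cases.
   Note: this overwrites description with its uppercase form. */
description = upcase(description);

/* Flag rows that mention TX or TREATMENT anywhere in the text */
if find(description, 'TX') &amp;gt; 0 
or find(description, 'TREATMENT') &amp;gt; 0 
	then flagtx=1;
else flagtx=0;

/* Exclusions: reset the flag for rows that mention SUPPLIES or DISCHARGE */
if find(description, 'SUPPLIES') &amp;gt; 0 
or find(description, 'DISCHARGE') &amp;gt; 0 
	then flagtx=0;

run;&lt;/CODE&gt;&lt;/PRE&gt;</description>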
      <pubDate>Mon, 20 Jun 2016 20:27:11 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-find-strings-in-a-dataset/m-p/278830#M56127</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2016-06-20T20:27:11Z</dc:date>
    </item>
    <item>
      <title>Re: How to find strings in a dataset?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-find-strings-in-a-dataset/m-p/278991#M56199</link>
      <description>&lt;P&gt;Thanks Reeza,&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Why did you use the UPCASE function? Does using this mean that the program will only find the strings in uppercase format?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;What if I wanted to find the strings in any variation? For example "TREATMENT" or "Treatment" or "treatment" or "TrEATment"?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you for helping me out :)&lt;/P&gt;</description>
      <pubDate>Tue, 21 Jun 2016 12:09:27 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-find-strings-in-a-dataset/m-p/278991#M56199</guid>
      <dc:creator>christinagting0</dc:creator>
      <dc:date>2016-06-21T12:09:27Z</dc:date>
    </item>
    <item>
      <title>Re: How to find strings in a dataset?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-find-strings-in-a-dataset/m-p/279002#M56203</link>
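      <description>&lt;P&gt;Upcase converts everything to capitals. This means you can do a single comparison rather than worrying about testing for different cases. Because the text is converted before it is searched, "TREATMENT", "Treatment", "treatment" and "TrEATment" all become "TREATMENT" and all match.&amp;nbsp;&lt;/P&gt;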
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
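&lt;P&gt;If you want to see it for yourself, here is a small sketch (the dataset name demo and the variables text, hit_upcase and hit_imod are made up for illustration). The first check searches an uppercased copy of the value. The second uses the 'i' modifier of FIND, which is an alternative to UPCASE that ignores case without changing the stored text.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Hypothetical demo: two ways to search without caring about case */
data demo;
length text $ 20;
input text $20.;

/* Option 1: uppercase a copy of the value, then search for the uppercase term */
hit_upcase = (find(upcase(text), 'TREATMENT') &amp;gt; 0);

/* Option 2: let FIND ignore case via its 'i' modifier */
hit_imod = (find(text, 'treatment', 'i') &amp;gt; 0);

datalines;
TREATMENT
Treatment
treatment
TrEATment
apples
;
run;

proc print data=demo;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Both flags should come out as 1 for the four spellings of treatment and 0 for apples, and the text column keeps its original capitalization.&lt;/P&gt;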
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 21 Jun 2016 12:40:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-find-strings-in-a-dataset/m-p/279002#M56203</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2016-06-21T12:40:13Z</dc:date>
    </item>
    <item>
      <title>Re: How to find strings in a dataset?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-find-strings-in-a-dataset/m-p/279004#M56204</link>
      <description>&lt;P&gt;OHH SMART!! THANK YOU VERY MUCH!&lt;/P&gt;</description>
      <pubDate>Tue, 21 Jun 2016 12:42:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-find-strings-in-a-dataset/m-p/279004#M56204</guid>
      <dc:creator>christinagting0</dc:creator>
      <dc:date>2016-06-21T12:42:12Z</dc:date>
    </item>
  </channel>
</rss>

