<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Removing varying junk text from a variable composed of a list of different text strings in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Removing-varying-junk-text-from-a-variable-composed-of-a-list-of/m-p/868604#M343131</link>
    <description>&lt;P&gt;Hi SAS Friends,&amp;nbsp;&lt;/P&gt;&lt;P&gt;Need to remove variable junk text from a long list of varying strings within a single text variable.&lt;/P&gt;&lt;P&gt;Here is a sample of the text variable with the junk text (HAVE),&lt;/P&gt;&lt;P&gt;What needs to be removed (REMOVE),&lt;/P&gt;&lt;P&gt;and what needs to be kept (WANT).&lt;/P&gt;&lt;P&gt;/***********************************************************/&lt;/P&gt;&lt;P&gt;data Have;&lt;BR /&gt;infile datalines truncover;&lt;BR /&gt;input Drug_Phrases $100.;&lt;BR /&gt;datalines;&lt;BR /&gt;2 days Ivermectin&lt;BR /&gt;2 mg moxidectin&lt;BR /&gt;20% deet insect repellent&lt;BR /&gt;3 days Ivermectin&lt;BR /&gt;3 drug dose - IDA with second dose of ivermectin&lt;BR /&gt;4 mg moxidectin&lt;BR /&gt;40 mg Atorvastatin/day for 120 days P.O.&lt;BR /&gt;400 μg/kg Ivermectin + 400 mg Albendazole&lt;BR /&gt;8 mg moxidectin&lt;BR /&gt;ABBV-4083&lt;BR /&gt;ALBENDAZOLE 400 Mg ORAL TABLET [ZENTEL]&lt;BR /&gt;;&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;data Remove;&lt;BR /&gt;infile datalines truncover;&lt;BR /&gt;input Remove $100.;&lt;BR /&gt;datalines;&lt;BR /&gt;2 days&lt;BR /&gt;2 mg&lt;BR /&gt;20%&lt;BR /&gt;insect repellent&lt;BR /&gt;3 days&lt;BR /&gt;3 drug dose - IDA with second dose of&lt;BR /&gt;4 mg&lt;BR /&gt;40 mg&lt;BR /&gt;/day for 120 days P.O.&lt;BR /&gt;400 μg/kg&lt;BR /&gt;+ 400 mg&lt;BR /&gt;8 mg&lt;BR /&gt;400 Mg ORAL TABLET [ZENTEL]&lt;BR /&gt;;&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;data Want;&lt;BR /&gt;infile datalines truncover;&lt;BR /&gt;input Drug_Names $100.;&lt;BR /&gt;datalines;&lt;BR /&gt;Ivermectin&lt;BR /&gt;moxidectin&lt;BR /&gt;deet&lt;BR /&gt;Ivermectin&lt;BR /&gt;ivermectin&lt;BR /&gt;moxidectin&lt;BR /&gt;Atorvastatin&lt;BR /&gt;Ivermectin Albendazole&lt;BR /&gt;moxidectin&lt;BR /&gt;ABBV-4083&lt;BR /&gt;ALBENDAZOLE&lt;BR /&gt;;&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;/***************************************************/&lt;/P&gt;&lt;P&gt;I think, given the long list of 
strings involved, it might be easier to create a "REMOVE" file, listing the text to be removed, then somehow use that file to extract those strings from the HAVE text file.&amp;nbsp; But beyond that don't know where to start.&amp;nbsp;&lt;/P&gt;&lt;P&gt;Appreciate any suggestions,&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you !&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Fri, 07 Apr 2023 14:18:27 GMT</pubDate>
    <dc:creator>rmacarthur</dc:creator>
    <dc:date>2023-04-07T14:18:27Z</dc:date>
    <item>
      <title>Removing varying junk text from a variable composed of a list of different text strings</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Removing-varying-junk-text-from-a-variable-composed-of-a-list-of/m-p/868604#M343131</link>
      <description>&lt;P&gt;Hi SAS Friends,&amp;nbsp;&lt;/P&gt;&lt;P&gt;Need to remove variable junk text from a long list of varying strings within a single text variable.&lt;/P&gt;&lt;P&gt;Here is a sample of the text variable with the junk text (HAVE),&lt;/P&gt;&lt;P&gt;What needs to be removed (REMOVE),&lt;/P&gt;&lt;P&gt;and what needs to be kept (WANT).&lt;/P&gt;&lt;P&gt;/***********************************************************/&lt;/P&gt;&lt;P&gt;data Have;&lt;BR /&gt;infile datalines truncover;&lt;BR /&gt;input Drug_Phrases $100.;&lt;BR /&gt;datalines;&lt;BR /&gt;2 days Ivermectin&lt;BR /&gt;2 mg moxidectin&lt;BR /&gt;20% deet insect repellent&lt;BR /&gt;3 days Ivermectin&lt;BR /&gt;3 drug dose - IDA with second dose of ivermectin&lt;BR /&gt;4 mg moxidectin&lt;BR /&gt;40 mg Atorvastatin/day for 120 days P.O.&lt;BR /&gt;400 μg/kg Ivermectin + 400 mg Albendazole&lt;BR /&gt;8 mg moxidectin&lt;BR /&gt;ABBV-4083&lt;BR /&gt;ALBENDAZOLE 400 Mg ORAL TABLET [ZENTEL]&lt;BR /&gt;;&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;data Remove;&lt;BR /&gt;infile datalines truncover;&lt;BR /&gt;input Remove $100.;&lt;BR /&gt;datalines;&lt;BR /&gt;2 days&lt;BR /&gt;2 mg&lt;BR /&gt;20%&lt;BR /&gt;insect repellent&lt;BR /&gt;3 days&lt;BR /&gt;3 drug dose - IDA with second dose of&lt;BR /&gt;4 mg&lt;BR /&gt;40 mg&lt;BR /&gt;/day for 120 days P.O.&lt;BR /&gt;400 μg/kg&lt;BR /&gt;+ 400 mg&lt;BR /&gt;8 mg&lt;BR /&gt;400 Mg ORAL TABLET [ZENTEL]&lt;BR /&gt;;&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;data Want;&lt;BR /&gt;infile datalines truncover;&lt;BR /&gt;input Drug_Names $100.;&lt;BR /&gt;datalines;&lt;BR /&gt;Ivermectin&lt;BR /&gt;moxidectin&lt;BR /&gt;deet&lt;BR /&gt;Ivermectin&lt;BR /&gt;ivermectin&lt;BR /&gt;moxidectin&lt;BR /&gt;Atorvastatin&lt;BR /&gt;Ivermectin Albendazole&lt;BR /&gt;moxidectin&lt;BR /&gt;ABBV-4083&lt;BR /&gt;ALBENDAZOLE&lt;BR /&gt;;&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;/***************************************************/&lt;/P&gt;&lt;P&gt;I think, given the long list of 
strings involved, it might be easier to create a "REMOVE" file, listing the text to be removed, then somehow use that file to extract those strings from the HAVE text file.&amp;nbsp; But beyond that don't know where to start.&amp;nbsp;&lt;/P&gt;&lt;P&gt;Appreciate any suggestions,&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you !&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 07 Apr 2023 14:18:27 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Removing-varying-junk-text-from-a-variable-composed-of-a-list-of/m-p/868604#M343131</guid>
      <dc:creator>rmacarthur</dc:creator>
      <dc:date>2023-04-07T14:18:27Z</dc:date>
    </item>
    <item>
      <title>Re: Removing varying junk text from a variable composed of a list of different text strings</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Removing-varying-junk-text-from-a-variable-composed-of-a-list-of/m-p/868610#M343137</link>
      <description>&lt;P&gt;This seems to work for most of your provided example:&lt;/P&gt;
&lt;P&gt;You may need to modify the "remove" values so that each "phrase" is a single value. If your phrase text looks like&lt;/P&gt;
&lt;P&gt;"blah blah Drugname other text" then the Remove values should be two lines like&lt;/P&gt;
&lt;P&gt;blah blah&lt;/P&gt;
&lt;P&gt;other text&lt;/P&gt;
&lt;P&gt;because the TRANWRD function used below replaces only exact matches and cannot infer that there is a value in the middle to work around.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
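&lt;PRE&gt;data _null_;
  set remove end=lastone;
  /* first row of Remove: queue the opening of the generated data step */
  if _n_=1 then call execute('data want; set have;');
  /* every row: queue a TRANWRD call that replaces this Remove value with "*" */
  call execute ('drug_phrases= tranwrd(drug_phrases,'||quote(strip(remove))||',"*");');
  /* last row: drop the "*" placeholders, left-justify, and close the step */
  if lastone then call execute('drug_phrases=strip(compress(drug_phrases,"*"));run;');
run;&lt;/PRE&gt;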
&lt;P&gt;CALL EXECUTE places statements in an execution buffer that runs after the data step calling it completes. Here the buffered statements form a new data step. The _n_=1 branch executes only once, to write the boilerplate that starts that data step.&lt;/P&gt;
&lt;P&gt;The main statement, repeated for each value in Remove, writes a TRANWRD call that replaces the text in Remove with an *. STRIP is needed because otherwise the value of Remove keeps its trailing padding spaces when concatenated with the || operator and would not match the text in the Drug_Phrases variable. QUOTE wraps the value of Remove in double quotes so that it is a valid string literal in the generated code.&lt;/P&gt;
&lt;P&gt;When the last record is read from Remove, the END= option on the SET statement sets the variable lastone to indicate this. At that point the generated code removes all the * characters with the COMPRESS function, and STRIP makes sure the remaining text is left-justified (no leading spaces).&lt;/P&gt;
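&lt;P&gt;As a rough sketch, assuming the Have and Remove data sets exactly as posted in the question, the statements queued by CALL EXECUTE should amount to a data step along these lines. Only the first few generated TRANWRD statements are written out here; one is generated per row of Remove, in the order the rows appear.&lt;/P&gt;
&lt;PRE&gt;data want; set have;
drug_phrases= tranwrd(drug_phrases,"2 days","*");
drug_phrases= tranwrd(drug_phrases,"2 mg","*");
drug_phrases= tranwrd(drug_phrases,"20%","*");
drug_phrases= tranwrd(drug_phrases,"insect repellent","*");
/* ... one TRANWRD statement for each remaining row of Remove ... */
drug_phrases=strip(compress(drug_phrases,"*"));run;&lt;/PRE&gt;
&lt;P&gt;Writing the step out by hand like this can make it easier to see which Remove value affects which row. Note that TRANWRD matching is case-sensitive, so each Remove value has to match the text in Drug_Phrases exactly.&lt;/P&gt;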
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You might consider modifying the code to write a program file using the FILE and PUT statements instead of CALL EXECUTE, so that you have a record of the generated code, and then execute that program file.&lt;/P&gt;</description>
      <pubDate>Fri, 07 Apr 2023 14:49:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Removing-varying-junk-text-from-a-variable-composed-of-a-list-of/m-p/868610#M343137</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2023-04-07T14:49:41Z</dc:date>
    </item>
    <item>
      <title>Re: Removing varying junk text from a variable composed of a list of different text strings</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Removing-varying-junk-text-from-a-variable-composed-of-a-list-of/m-p/868613#M343139</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;SPAN&gt;ballardw,&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Thank you, this works great !&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;It will take some time to understand all the parts, but the output is correct&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;A very cool solution.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Thank you !&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 07 Apr 2023 14:59:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Removing-varying-junk-text-from-a-variable-composed-of-a-list-of/m-p/868613#M343139</guid>
      <dc:creator>rmacarthur</dc:creator>
      <dc:date>2023-04-07T14:59:37Z</dc:date>
    </item>
  </channel>
</rss>

