<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Remove words with numbers in them from a string variable in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Remove-words-with-numbers-in-them-from-a-string-variable/m-p/616131#M180323</link>
    <description>&lt;P&gt;Please post the data you have in usable form (data step with datalines) so that we know exactly what you have.&lt;/P&gt;
&lt;P&gt;The following step removes all two-letter words:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data narf;
   length Text $ 200;
   input Text &amp;amp;;

   output;
   Text = prxchange('s/\b(\w\w)\b/ /', -1, Text);
   output;

   datalines;
If you use regular-expression-id, the PRXCHANGE function searches the variable source with the regular-expression-id that is returned by PRXPARSE. 
It returns the value in source with the changes that were specified by the regular expression. 
If there is no match, PRXCHANGE returns the unchanged value in source. 
run;
&lt;/CODE&gt;&lt;/PRE&gt;</description>
    <pubDate>Thu, 09 Jan 2020 06:56:39 GMT</pubDate>
    <dc:creator>andreas_lds</dc:creator>
    <dc:date>2020-01-09T06:56:39Z</dc:date>
    <item>
      <title>Remove words with numbers in them from a string variable</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Remove-words-with-numbers-in-them-from-a-string-variable/m-p/616082#M180303</link>
      <description>&lt;P&gt;Hey guys,&lt;/P&gt;&lt;P&gt;I need to remove the words (or terms) with numbers in them from a string. I have tried compress, translate, tranwrd and prxchange, but with no luck.&lt;/P&gt;&lt;P&gt;The code I have used:&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data cleaned;
        set &amp;amp;SRC_DTST;
		NO_NUM_SCHAR=COMPBL(TRANSLATE(upcase(&amp;amp;COLUMN), " " , ".,;:?!-/\+[]%1234567890$@#){}'|^&amp;amp;~*&amp;lt;&amp;gt;("));
		NO_NUM_SCHAR=COMPRESS(NO_NUM_SCHAR,,'KAW');
		NO_NUM_SCHAR = prxchange('s/\s+/ /oi',-1,trim(NO_NUM_SCHAR));
		NO_NUM_SCHAR = TRANWRD(NO_NUM_SCHAR, '09'x, '');
        NO_STP_WRD=prxchange('s/\b(JR|SR|III|IV|DECD|THE|A|AN|I|HE|SHE|WE|IT|THEM|TO|AND|AS|OF|FROM|TO|ABOARD|IF|II|IV|OR|NON|ABOUT|HAVE|HAD|HOW|ONE|
                NOT|BEEN|ABOVE|ACROSS|AFTER|AGAINST|ALONG|AMID|AMONG|ANTI|AROUND|AS|AT|BEFORE|BEHIND|BELOW|BENEATH|BESIDE|BESIDES|BETWEEN|
				BEYOND|BUT|BY|CONCERNING|CONSIDERING|DESPITE|DOWN|DURING|EXCEPT|EXCEPTING|EXCLUDING|FOLLOWING|FOR|FROM|IN|INSIDE|INTO|LIKE|
				MINUS|NEAR|OF|OFF|ON|ONTO|OPPOSITE|OUTSIDE|OVER|PAST|PER|PLUS|REGARDING|ROUND|SAVE|SINCE|THAN|THROUGH|TO|TOWARD|TOWARDS|
				UNDER|UNDERNEATH|UNLIKE|UNTIL|UP|UPON|VERSUS|VIA|WITH|WITHIN|WITHOUT|FULL|TYPE|NONE|OTHER|MUST|NON|B|C|D|E|F|G|H|I|J|K|L|
				M|N|O|P|Q|R|S|T|U|V|W|X|Y|Z)\b/ /o',-1,NO_NUM_SCHAR);
        cleaned_desc = COMPBL(STRIP(NO_STP_WRD));
		DROP NO_NUM_SCHAR NO_STP_WRD;
    run;

/* next step is for removing duplicate words */
data cleaned(keep=Concatenated_Categories LOV_LONG_DSC cleaned_desc);
    set cleaned;
    newstring=scan(cleaned_desc, 1, ' ');
    do i=2 to countw(cleaned_desc,' ');
        word=scan(cleaned_desc, i, ' ');
        found=find(newstring, word, 'it');
        if found=0 then newstring=catx(' ', newstring, word);
    end;
    cleaned_desc= newstring;
    DROP newstring;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Input:&lt;/P&gt;
&lt;P&gt;ATN1 (atrophin 1) (eg, dentatorubral-pallidoluysian atrophy) gene analysis, evaluation to detect abnormal (eg, expanded) alleles&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;My output:&lt;/P&gt;
&lt;P&gt;ATN ATROPHIN EG DENTATORUBRAL PALLIDOLUYSIAN ATROPHY GENE ANALYSIS EVALUATION DETECT ABNORMAL EXPANDED ALLELES&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Expected output:&lt;/P&gt;
&lt;P&gt;ATROPHIN EG DENTATORUBRAL PALLIDOLUYSIAN ATROPHY GENE ANALYSIS EVALUATION DETECT ABNORMAL EXPANDED ALLELES&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Also good to have:&lt;/P&gt;
&lt;P&gt;I also want to remove any 2 letter words from the string such as 'EG' in this case.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Any guidance will be greatly appreciated.&lt;/P&gt;</description>
      <pubDate>Wed, 08 Jan 2020 21:54:56 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Remove-words-with-numbers-in-them-from-a-string-variable/m-p/616082#M180303</guid>
      <dc:creator>mosabbirfardin</dc:creator>
      <dc:date>2020-01-08T21:54:56Z</dc:date>
    </item>
    <item>
      <title>Re: Remove words with numbers in them from a string variable</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Remove-words-with-numbers-in-them-from-a-string-variable/m-p/616131#M180323</link>
      <description>&lt;P&gt;Please post the data you have in usable form (data step with datalines) so that we know exactly what you have.&lt;/P&gt;
&lt;P&gt;The following step removes all two-letter words:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data narf;
   length Text $ 200;
   input Text &amp;amp;;

   output;
   Text = prxchange('s/\b(\w\w)\b/ /', -1, Text);
   output;

   datalines;
If you use regular-expression-id, the PRXCHANGE function searches the variable source with the regular-expression-id that is returned by PRXPARSE. 
It returns the value in source with the changes that were specified by the regular expression. 
If there is no match, PRXCHANGE returns the unchanged value in source. 
run;
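
/* Sketch only (an addition, not part of the step above): the same PRXCHANGE  */
/* idea could also drop words that contain a digit, such as ATN1 in the       */
/* question. Assumptions: a "word" is a run of \w characters, and narf2 is a  */
/* hypothetical data set name used only for illustration.                     */
data narf2;
   length Text $ 200;
   Text = 'ATN1 (atrophin 1) gene analysis';
   /* \w*\d\w* matches any word that has at least one digit in it */
   Text = compbl(prxchange('s/\b\w*\d\w*\b/ /', -1, Text));
   put Text=;
run;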
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 09 Jan 2020 06:56:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Remove-words-with-numbers-in-them-from-a-string-variable/m-p/616131#M180323</guid>
      <dc:creator>andreas_lds</dc:creator>
      <dc:date>2020-01-09T06:56:39Z</dc:date>
    </item>
    <item>
      <title>Re: Remove words with numbers in them from a string variable</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Remove-words-with-numbers-in-them-from-a-string-variable/m-p/616155#M180338</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/254232"&gt;@mosabbirfardin&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Your requirements need 3 steps.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;In the first step, every character in '()-,.' is replaced by a SPACE.&lt;/P&gt;
&lt;P&gt;The second step looks for each 2-character word and removes it.&lt;/P&gt;
&lt;P&gt;The third step looks for each word containing a digit and removes it.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
   length txt new $ 200 word $ 50;
   txt = 'ATN1 (atrophin 1) (eg, dentatorubral-pallidoluysian atrophy) gene analysis, evaluation to detect abnormal (eg, expanded) alleles';
   /* Step 1: turn the punctuation characters into blanks */
   txt = translate(txt, ' ', '()-,.');
   /* Steps 2 and 3: rebuild the string from the words worth keeping, so that */
   /* dropping a word never alters another word that merely contains it       */
   do i = 1 to countw(txt, ' ');
      word = scan(txt, i, ' ');
      if length(word) ne 2 and not anydigit(word) then new = catx(' ', new, word);
   end;
   txt = upcase(new);
   put txt =;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 09 Jan 2020 09:52:34 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Remove-words-with-numbers-in-them-from-a-string-variable/m-p/616155#M180338</guid>
      <dc:creator>KachiM</dc:creator>
      <dc:date>2020-01-09T09:52:34Z</dc:date>
    </item>
  </channel>
</rss>

