<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to extract 6 different test results from a text field in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/How-to-extract-6-different-test-results-from-a-text-field/m-p/703209#M215444</link>
    <description>&lt;P&gt;May I assume that the prefix&amp;nbsp;&lt;STRONG&gt;Perfluoro&lt;/STRONG&gt; appears only at start of the text to be extracted and not in between extracted texts ?&lt;/P&gt;
&lt;P&gt;Are All acids and is the extracted text always in a format of "&lt;STRONG&gt;Perfluoro&lt;/STRONG&gt;&amp;nbsp;....&amp;nbsp;&lt;STRONG&gt;Acid&amp;nbsp;&lt;/STRONG&gt;&amp;lt;amount&amp;gt;&amp;nbsp;&lt;STRONG&gt;ng/ml" ?&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If positive then next code (not tested) may help:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;
data want;
 set have;
     length text1 $32000 
	        searc_for $9
			out_var1 $30
			out_value  8
			out_unit  $5
		;
	flag=1;
	search_for = 'Perfluoro';
	text1 = substr(result_notes,index(result_notes,search_for));
	do until falg=0;
	   out_var1 = scan(text1,1)|| 'Acid';
	   out_value = input(scan(text1,3),best4.);
	   out_unit = scan(text1,4,' .');
	   output; 
	   keep out_var1 out_value out_unit;
	   
	   ix = index(text1,out_unit) + length(out_unit) +1;
	   text1 = substr(text1,index(result_notes,search_for));
	   if text1 = '' then flag=0;
	end;
run;
	   
	 &lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Wed, 02 Dec 2020 21:48:25 GMT</pubDate>
    <dc:creator>Shmuel</dc:creator>
    <dc:date>2020-12-02T21:48:25Z</dc:date>
    <item>
      <title>How to extract 6 different test results from a text field</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-extract-6-different-test-results-from-a-text-field/m-p/703194#M215437</link>
      <description>&lt;P&gt;&lt;SPAN&gt;&lt;SPAN&gt;I need to find the most efficient way possible to extract 6 test results that are populated in one text field.&amp;nbsp; The variable is called RESULT_NOTES and the general format of the variable looks something like this (patterns vary a little for each record- just an example of 1, also all of the numbers are made up in this example):&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;&lt;SPAN&gt;&lt;FONT face="Times New Roman"&gt;&lt;BR /&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;SPAN&gt;&lt;FONT face="Times New Roman"&gt;&lt;SPAN&gt;&lt;FONT&gt;&amp;nbsp;&lt;/FONT&gt;&lt;/SPAN&gt; &lt;SPAN&gt;&lt;FONT&gt;&lt;SPAN style="font-family: 'Calibri',sans-serif; mso-bidi-font-weight: bold;"&gt;RESULT COMMENT(S): ---------------------------------------------------------------------~ TEST NAME RESULT ALERT UOM PERFORMED AT~---------------------------------------------------------------------~&lt;STRONG&gt;Perfluorobutanesulfonic Acid 10 ng/mL &lt;/STRONG&gt;01~Synonym(s): PFBS~Population reference interval derived from BBC Laboratories~data (n=300) is usually less than 20 ng/mL~(90% CI, &amp;lt;0.&lt;SPAN style="line-height: 107%; font-family: 'Calibri',sans-serif; font-size: 11pt; mso-bidi-font-weight: bold; mso-fareast-font-family: Calibri; mso-fareast-theme-font: minor-latin; mso-ansi-language: EN-US; mso-fareast-language: EN-US; mso-bidi-language: AR-SA;"&gt;&lt;FONT&gt;0.050 &lt;/FONT&gt;&lt;/SPAN&gt;- 0.23 ng/mL) (97.5th percentile)~General U.S. population from CDC-NHANES (2013-2014)~(n=3000) (isomers not described) is typically~below 0.1 ng/mL (95th percentile)~ -----~&lt;STRONG&gt;Perfluoroheptanoic Acid 8 ng/mL &lt;/STRONG&gt;01~Synonym(s): PFHpA~Population reference interval derived from BBC Laboratories~data (n=300) is usually less than 97 ng/mL~(90% CI, 0.25 - 0.73 ng/mL) (97.5th percentile)~General U.S. population from CDC-NHANES (2013-2014)~(n=2168) (isomers not described) is typically~below 0.20 ng/mL (95% CI, 0.10 - 0.20 ng/mL)~(95th percentile)~ -----~&lt;STRONG&gt;Perfluorohexanesulfonic Acid 16 ng/mL&lt;/STRONG&gt; 01~Synonym(s): PFHxS~Population reference interval derived from BBC Laboratories~data (n=300) is usually less than 30&amp;nbsp; ng/mL~(90% CI, 4.1 - 17 ng/mL) (97.5th percentile)~General U.S. population from CDC-NHANES (2015-2016)~(n=1993) (isomers not described) is typically~below 4.9 ng/mL (95% CI, 4.1 - 5.8 ng/mL)~(95th percentile)~ -----~&lt;STRONG&gt;Perfluorooctanoic Acid 20 ng/mL &lt;/STRONG&gt;01~Synonym(s): PFOA; FC-143 Component~Population reference interval derived from~BBC Laboratories data (n=300) is usually less than 4.1 ng/mL~(90% CI, 3.3 - 8.0 ng/mL) (97.5th percentile)~General U.S. population from CDC-NHANES (2015-2016)~(n=1993) for the linear isomer is typically below~8 ng/mL (95% CI,7.6-10.3 ng/mL) (95th percentile)~below 1.9 ng/mL (95% CI, 1.5 - 2.2 ng/mL)~(95th percentile)~ -----~&lt;STRONG&gt;Perfluorononanoic Acid 5 ng/mL&lt;/STRONG&gt; 01~Synonym(s): PFNA~Population reference interval derived from BBC Laboratories~data (n=300) is usually less than 1.4 ng/mL~(90% CI, 1.2 - 2.3 ng/mL) (97.5th percentile)~General U.S. population from CDC-NHANES (2015-2016)~(n=1993) (isomers not described) is typically~below 1.9 ng/mL (95% CI, 1.5 - 2.2 ng/mL)~(95th percentile)~ -----~&lt;STRONG&gt;Perfluorooctanesulfonic Acid 12 ng/mL&lt;/STRONG&gt; 01~Please be advised that this report contains fictitious~results to be used for interface program validation.~This report does not contain real patient results.~Synonym(s): PFOS~Population reference interval derived from BBC Laboratories~data (n=300) is usually less than 12 ng/mL~(90% CI, 7.7 - 15 ng/mL) (97.5th percentile)~General U.S. population from CDC-NHANES (2015-2016)~(n=1993) for the linear isomer is typically~below 13 ng/mL (95% CI, 10 - 18 ng/mL)~(95th percentile)~This test was developed and its performance characteristics~determined by BBC Laboratories. It has not been cleared or approved~by the US Food and Drug Administration.~ -----~---------------------------------------------------------------------~ ALERT/ABNORMAL FLAG LEGEND:~ L= Below Low Normal, H= Above High Normal, LL= Alert Low~ HH= Alert High, &amp;lt;= Panic Low, &amp;gt;= Panic High, A= Abnormal~---------------------------------------------------------------------&lt;/SPAN&gt; &lt;/FONT&gt;&lt;/SPAN&gt;&lt;SPAN&gt;&lt;SPAN&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;&lt;FONT face="Times New Roman"&gt;&lt;SPAN&gt;&lt;SPAN&gt;Ideally, I would need to create a variable for each compound, with the numeric value and then a separate variable for the units an example below:&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;&lt;FONT face="Times New Roman"&gt;&lt;SPAN&gt;&lt;SPAN&gt;&lt;STRONG&gt;&lt;FONT face="Calibri"&gt;&lt;STRONG&gt;Perfluorobutanesulfonic_Value&amp;nbsp; &amp;nbsp; &amp;nbsp;&lt;/STRONG&gt;&lt;STRONG&gt;Perfluorobutanesulfonic_Units&amp;nbsp; &amp;nbsp; &amp;nbsp;etc...…(for each compound bolded above)&lt;/STRONG&gt;&lt;/FONT&gt;&lt;/STRONG&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;&lt;FONT face="Times New Roman"&gt;&lt;SPAN&gt;&lt;SPAN&gt;&lt;STRONG&gt;&lt;FONT face="Calibri"&gt;&lt;STRONG&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;10&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; ng/ml&lt;/STRONG&gt;&lt;/FONT&gt;&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you for any help or suggestions!&lt;/P&gt;&lt;DIV class="mce-pastebin"&gt;%MCEPASTEBIN%&lt;/DIV&gt;&lt;DIV class="mce-pastebin"&gt;%MCEPASTEBIN&lt;/DIV&gt;</description>
      <pubDate>Wed, 02 Dec 2020 20:44:29 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-extract-6-different-test-results-from-a-text-field/m-p/703194#M215437</guid>
      <dc:creator>sas21</dc:creator>
      <dc:date>2020-12-02T20:44:29Z</dc:date>
    </item>
    <item>
      <title>Re: How to extract 6 different test results from a text field</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-extract-6-different-test-results-from-a-text-field/m-p/703209#M215444</link>
      <description>&lt;P&gt;May I assume that the prefix&amp;nbsp;&lt;STRONG&gt;Perfluoro&lt;/STRONG&gt; appears only at start of the text to be extracted and not in between extracted texts ?&lt;/P&gt;
&lt;P&gt;Are All acids and is the extracted text always in a format of "&lt;STRONG&gt;Perfluoro&lt;/STRONG&gt;&amp;nbsp;....&amp;nbsp;&lt;STRONG&gt;Acid&amp;nbsp;&lt;/STRONG&gt;&amp;lt;amount&amp;gt;&amp;nbsp;&lt;STRONG&gt;ng/ml" ?&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If positive then next code (not tested) may help:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;
data want;
 set have;
     length text1 $32000 
	        searc_for $9
			out_var1 $30
			out_value  8
			out_unit  $5
		;
	flag=1;
	search_for = 'Perfluoro';
	text1 = substr(result_notes,index(result_notes,search_for));
	do until falg=0;
	   out_var1 = scan(text1,1)|| 'Acid';
	   out_value = input(scan(text1,3),best4.);
	   out_unit = scan(text1,4,' .');
	   output; 
	   keep out_var1 out_value out_unit;
	   
	   ix = index(text1,out_unit) + length(out_unit) +1;
	   text1 = substr(text1,index(result_notes,search_for));
	   if text1 = '' then flag=0;
	end;
run;
	   
	 &lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 02 Dec 2020 21:48:25 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-extract-6-different-test-results-from-a-text-field/m-p/703209#M215444</guid>
      <dc:creator>Shmuel</dc:creator>
      <dc:date>2020-12-02T21:48:25Z</dc:date>
    </item>
    <item>
      <title>Re: How to extract 6 different test results from a text field</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-extract-6-different-test-results-from-a-text-field/m-p/703211#M215445</link>
      <description>&lt;P&gt;Here's a start, anyway. This code assumes your data is in a text file named "remarks.txt" located in the SAS default directory.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data intermediate;
   drop _:;
   retain __rxID 0;
   length _Found $ 32767 Name $50 Units $ 15 Value 8;
   if _n_=1 then do;
      _rxID=prxparse('/-~(\w+\s?\w*\/?\w*\s)+01~/');
      put 'NOTE: ' _rxID=;
   end;
   infile "remarks.txt";
   input;
   _start = 1;
   _stop = length(_infile_);
   call prxnext(_rxID, _start, _stop, _infile_, _position, _length);
      do while (_position &amp;gt; 0);
         _found = substr(_infile_, _position+2, _length-3);
         Units=scan(_found,-2,' ');
         Value=input(scan(_found,-3,' '),10.);
         Name=substr(_found,1,anydigit(_found)-2);
         output;
         call prxnext(_rxID, _start, _stop, _infile_, _position, _length);
      end;
run;

%macro Fixnames;
proc sql noprint;
select distinct
       Name
      ,cats(scan(Name,1),'_Value')
      ,cats(scan(Name,1),'_Units')
  into :Name1- , :val1- , :unit1-
  from intermediate
  order by 1
;
quit;

data want;
  retain 
     %do i=1 %to &amp;amp;sqlobs;
     &amp;amp;&amp;amp;val&amp;amp;i 0 
     &amp;amp;&amp;amp;unit&amp;amp;i '          '
     %end;
  ;
  set intermediate end=last;
  %do i=1 %to &amp;amp;sqlobs;
  if Name="&amp;amp;&amp;amp;name&amp;amp;i" then do;
     &amp;amp;&amp;amp;val&amp;amp;i=value;
     &amp;amp;&amp;amp;unit&amp;amp;i=units;
  end;
  %end;
  if last then output;
run;
%mend;

%fixnames
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The program produces this result:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;DIV class="branch"&gt;
&lt;DIV&gt;
&lt;DIV align="center"&gt;
&lt;TABLE class="table" summary="Procedure Print: Data Set WORK.WANT" frame="box" rules="all" cellspacing="0" cellpadding="5"&gt;
&lt;THEAD&gt;
&lt;TR&gt;
&lt;TH class="r header" scope="col"&gt;Obs&lt;/TH&gt;
&lt;TH class="r header" scope="col"&gt;Perfluorobutanesulfonic_Value&lt;/TH&gt;
&lt;TH class="l header" scope="col"&gt;Perfluorobutanesulfonic_Units&lt;/TH&gt;
&lt;TH class="r header" scope="col"&gt;Perfluoroheptanoic_Value&lt;/TH&gt;
&lt;TH class="l header" scope="col"&gt;Perfluoroheptanoic_Units&lt;/TH&gt;
&lt;TH class="r header" scope="col"&gt;Perfluorohexanesulfonic_Value&lt;/TH&gt;
&lt;TH class="l header" scope="col"&gt;Perfluorohexanesulfonic_Units&lt;/TH&gt;
&lt;TH class="r header" scope="col"&gt;Perfluorononanoic_Value&lt;/TH&gt;
&lt;TH class="l header" scope="col"&gt;Perfluorononanoic_Units&lt;/TH&gt;
&lt;TH class="r header" scope="col"&gt;Perfluorooctanesulfonic_Value&lt;/TH&gt;
&lt;TH class="l header" scope="col"&gt;Perfluorooctanesulfonic_Units&lt;/TH&gt;
&lt;TH class="r header" scope="col"&gt;Perfluorooctanoic_Value&lt;/TH&gt;
&lt;TH class="l header" scope="col"&gt;Perfluorooctanoic_Units&lt;/TH&gt;
&lt;/TR&gt;
&lt;/THEAD&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TH class="r rowheader" scope="row"&gt;1&lt;/TH&gt;
&lt;TD class="r data"&gt;10&lt;/TD&gt;
&lt;TD class="l data"&gt;ng/mL&lt;/TD&gt;
&lt;TD class="r data"&gt;8&lt;/TD&gt;
&lt;TD class="l data"&gt;ng/mL&lt;/TD&gt;
&lt;TD class="r data"&gt;16&lt;/TD&gt;
&lt;TD class="l data"&gt;ng/mL&lt;/TD&gt;
&lt;TD class="r data"&gt;5&lt;/TD&gt;
&lt;TD class="l data"&gt;ng/mL&lt;/TD&gt;
&lt;TD class="r data"&gt;12&lt;/TD&gt;
&lt;TD class="l data"&gt;ng/mL&lt;/TD&gt;
&lt;TD class="r data"&gt;20&lt;/TD&gt;
&lt;TD class="l data"&gt;ng/mL&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;P&gt;&lt;BR /&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 02 Dec 2020 21:54:14 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-extract-6-different-test-results-from-a-text-field/m-p/703211#M215445</guid>
      <dc:creator>SASJedi</dc:creator>
      <dc:date>2020-12-02T21:54:14Z</dc:date>
    </item>
    <item>
      <title>Re: How to extract 6 different test results from a text field</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-extract-6-different-test-results-from-a-text-field/m-p/703217#M215448</link>
      <description>&lt;P&gt;A typical strategy:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data wantList;
set have;
stringId = _n_;
do compound = 
    "Perfluorobutanesulfonic Acid", 
    "Perfluoroheptanoic Acid",
    "Perfluorohexanesulfonic Acid";
    pos = find(text, compound, "it");
    if pos &amp;gt; 0 then do;
        pos = pos + length(compound);
        length var $26; /* 32 - length("_units") */
        var = scan(compound, 1);
        value = input(scan(substr(text, pos), 1, " "), best.);
        units = scan(substr(text, pos), 2, " ");
        output;
        end;
    end;
drop text pos;
run;

proc transpose data=wantList out=wantvalues(drop=_name_) suffix=_value;
by stringId;
var value;
id var;
run;

proc transpose data=wantList out=wantunits(drop=_name_) suffix=_units;
by stringId;
var units;
id var;
run;

data want;
merge wantValues wantUnits;
by stringId;
drop stringId;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;You need two transpose steps because &lt;EM&gt;values&lt;/EM&gt; are numeric and &lt;EM&gt;units&lt;/EM&gt; are character.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 02 Dec 2020 22:16:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-extract-6-different-test-results-from-a-text-field/m-p/703217#M215448</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2020-12-02T22:16:48Z</dc:date>
    </item>
  </channel>
</rss>

