<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Extracting part of a string in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Extracting-part-of-a-string/m-p/445034#M111476</link>
    <description>&lt;P&gt;Hi everyone&lt;/P&gt;
&lt;P&gt;I need help with 2 issues&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;1. I have a file containing thousands of entries that look like this:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;PatientFirstName;PatientLastName;PatientDateOfBirth;PatientID;HomeDoctorSalutation;HomeDoctorFirstName;HomeDoctorLastName;ExamType;ExamDate;ExaminerSalutation;ExaminerFirstName;ExaminerLastName; Comment 3;Image Comment 4;_x000D_&lt;/P&gt;
&lt;P&gt;"&lt;FONT color="#FF0000"&gt;John&lt;/FONT&gt;";"&lt;FONT color="#FF0000"&gt;Doe&lt;/FONT&gt;";"&lt;FONT color="#FF0000"&gt;01/01/99&lt;/FONT&gt;";"&lt;FONT color="#FF0000"&gt;9999999&lt;/FONT&gt;";"MS";"Alex";"Josef";"surgery";"&lt;FONT color="#FF0000"&gt;01/01/17&lt;/FONT&gt;";"MR";"John";"Watson";"Mr";"Sherlock";"Holmes";"GX";"11111";"";"";"";"";"";"";&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I need to extract the parts I marked in red: the first 4 entries in quotation marks and the 9th one. They are always in that order and always enclosed in quotation marks.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;2. Then I have another file with entries as follows:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;X:\&lt;FONT color="#FF0000"&gt;1111111&lt;/FONT&gt;_5748392_222222222_DDDDD.PDF&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Here I also need the entry I marked in red, i.e. the part between "\" and the first "_".&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I know this is doable with a regular expression, but it would probably take me a couple of days to figure out a solution, so I am reaching out to the community for some help, please...&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Kind regards&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;AM&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Tue, 13 Mar 2018 05:23:02 GMT</pubDate>
    <dc:creator>ammarhm</dc:creator>
    <dc:date>2018-03-13T05:23:02Z</dc:date>
    <item>
      <title>Extracting part of a string</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Extracting-part-of-a-string/m-p/445034#M111476</link>
      <description>&lt;P&gt;Hi everyone&lt;/P&gt;
&lt;P&gt;I need help with 2 issues&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;1. I have a file containing thousands of entries that look like this:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;PatientFirstName;PatientLastName;PatientDateOfBirth;PatientID;HomeDoctorSalutation;HomeDoctorFirstName;HomeDoctorLastName;ExamType;ExamDate;ExaminerSalutation;ExaminerFirstName;ExaminerLastName; Comment 3;Image Comment 4;_x000D_&lt;/P&gt;
&lt;P&gt;"&lt;FONT color="#FF0000"&gt;John&lt;/FONT&gt;";"&lt;FONT color="#FF0000"&gt;Doe&lt;/FONT&gt;";"&lt;FONT color="#FF0000"&gt;01/01/99&lt;/FONT&gt;";"&lt;FONT color="#FF0000"&gt;9999999&lt;/FONT&gt;";"MS";"Alex";"Josef";"surgery";"&lt;FONT color="#FF0000"&gt;01/01/17&lt;/FONT&gt;";"MR";"John";"Watson";"Mr";"Sherlock";"Holmes";"GX";"11111";"";"";"";"";"";"";&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I need to extract the parts I marked in red: the first 4 entries in quotation marks and the 9th one. They are always in that order and always enclosed in quotation marks.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;2. Then I have another file with entries as follows:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;X:\&lt;FONT color="#FF0000"&gt;1111111&lt;/FONT&gt;_5748392_222222222_DDDDD.PDF&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Here I also need the entry I marked in red, i.e. the part between "\" and the first "_".&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I know this is doable with a regular expression, but it would probably take me a couple of days to figure out a solution, so I am reaching out to the community for some help, please...&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Kind regards&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;AM&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 13 Mar 2018 05:23:02 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Extracting-part-of-a-string/m-p/445034#M111476</guid>
      <dc:creator>ammarhm</dc:creator>
      <dc:date>2018-03-13T05:23:02Z</dc:date>
    </item>
    <item>
      <title>Re: Extracting part of a string</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Extracting-part-of-a-string/m-p/445163#M111521</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;you can achieve&amp;nbsp;what you want with this piece of code:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/*
I need to extract the parts I marked in red, basically, the first 4 entries in quotation marks
and the 9th one. They are always in that order and always enclosed in quotation marks
*/
/* Read the semicolon-delimited file; the first row supplies the variable names */
proc import datafile="X:\YourFile.txt"
     out=test1
     dbms=dlm
     replace;
     getnames=yes;
     delimiter=';';
run;

data test2;
   set test1;
   keep PatientFirstName PatientLastName PatientDateOfBirth PatientID ExamDate;
run;
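
/*
Alternative sketch, not part of the original answer: read the same file with a
DATA step so that only the needed fields are kept and nothing depends on how
PROC IMPORT guesses types. Assumptions: the path is the same placeholder as
above, the lengths are guesses, and both dates are read as plain text because
the post does not say whether 01/01/99 is month/day or day/month.
*/
data test2_alt;
   /* DSD strips the surrounding quotes and treats ;; as an empty field */
   infile "X:\YourFile.txt" dsd dlm=';' firstobs=2 truncover;
   length PatientFirstName PatientLastName $50
          PatientDateOfBirth $10 PatientID $20
          ExamDate $10 skip $1;
   /* fields 5 to 8 are read into a throwaway variable to reach field 9 */
   input PatientFirstName PatientLastName PatientDateOfBirth PatientID
         skip skip skip skip ExamDate;
   drop skip;
run;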

/*
I also need the entry I marked in red, basically the one between "\" and "_"
*/
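/*
Sketch for the case where the paths are lines in a text file, as the question
describes, rather than names taken from a directory listing. Assumptions:
"X:\YourPaths.txt" is a hypothetical file name, and the variable names
fullpath and id are made up for illustration.
*/
data a_alt;
   infile "X:\YourPaths.txt" truncover;
   input fullpath $char1000.;
   length id $20;
   /* the piece after the last backslash is the file name;
      its first piece before an underscore is the wanted entry */
   id = scan(scan(fullpath, -1, '\'), 1, '_');
run;
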
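/* dir /b lists bare file names (no path); /a-d excludes directories */
filename files pipe 'dir  "X:\*.pdf" /b/a-d ' lrecl=5000;
data a;
   infile files truncover;
   input files $char1000.;
   /* keep only the part of the file name before the first underscore */
   files=scan(files,1,'_');
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>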
      <pubDate>Tue, 13 Mar 2018 13:32:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Extracting-part-of-a-string/m-p/445163#M111521</guid>
      <dc:creator>Oligolas</dc:creator>
      <dc:date>2018-03-13T13:32:22Z</dc:date>
    </item>
  </channel>
</rss>

