<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Hash find vs datastep find in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Hash-find-vs-datastep-find/m-p/443658#M111017</link>
    <description>&lt;P&gt;Posting .sas7bdat files is often not helpful, as encodings or byte order may prevent users from using them. That's why we STRONGLY recommend posting example data as a data step with datalines, since the code can simply be copied, pasted, and submitted in any environment.&lt;/P&gt;
&lt;P&gt;A macro to do that conversion automatically can be found here: &lt;A href="https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-data-AKA-generate/ta-p/258712" target="_blank"&gt;https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-data-AKA-generate/ta-p/258712&lt;/A&gt;&lt;/P&gt;</description>
    <pubDate>Thu, 08 Mar 2018 07:18:49 GMT</pubDate>
    <dc:creator>Kurt_Bremser</dc:creator>
    <dc:date>2018-03-08T07:18:49Z</dc:date>
    <item>
      <title>Hash find vs datastep find</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Hash-find-vs-datastep-find/m-p/443263#M110878</link>
      <description>&lt;P&gt;I have two datasets. One has clean text in a variable; the other has uncleaned text together with page numbers. I want to find each clean text within the uncleaned text, the way the FIND function works in a data step. Is there a solution?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I have tried the code below, but it does not search the way INDEX/FIND does in a data step.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;P.S.: I do not have a common variable to merge on in a data step, so I chose a hash table to look up page numbers from the other dataset. I cannot use a Perl regular expression since the variable acrf is unstructured.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data final;
    if _n_ eq 1 then do;
        if 0 then set acrftxt;
        dcl hash h(dataset:'acrftxt');
        h.definekey('acrf');
        h.definedata('pages');
        h.definedone();
    end;
    set gdb;
    if h.find(key: question) eq 0 then page=pages;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;acrftxt&amp;nbsp;dataset eg&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data ACRFTXT;&lt;BR /&gt; infile datalines dsd truncover;&lt;BR /&gt; input acrf:$32767. pages:$12.;&lt;BR /&gt;datalines4;&lt;BR /&gt;19. Preferred Term Code [hidden],1&lt;BR /&gt;[Preferred Term Code],1&lt;BR /&gt;20. High Level Term Code [hidden],1&lt;BR /&gt;&amp;#12;,2&lt;BR /&gt;[High Level Term Code],2&lt;BR /&gt;,2&lt;BR /&gt;21.,2&lt;BR /&gt;,2&lt;BR /&gt;High Level Group Term Code [hidden],2&lt;BR /&gt;,2&lt;BR /&gt;[High Level Group Term Code],2&lt;BR /&gt;,2&lt;BR /&gt;22.,2&lt;BR /&gt;,2&lt;BR /&gt;Body System or Organ Class Code [hidden],2&lt;BR /&gt;1.,4&lt;BR /&gt;2.,4&lt;BR /&gt;3.,4&lt;BR /&gt;4.,4&lt;BR /&gt;5.,4&lt;BR /&gt;6.,4&lt;BR /&gt;7.,4&lt;BR /&gt;,4&lt;BR /&gt;,4&lt;BR /&gt;Microbiology Subcategory,4&lt;BR /&gt;,4&lt;BR /&gt;[Microbiology Subcategory],4&lt;BR /&gt;,4&lt;BR /&gt;Specimen Type,4&lt;BR /&gt;,4&lt;BR /&gt;[Specimen Type],4&lt;BR /&gt;,4&lt;BR /&gt;What was the site of specimen?,4&lt;BR /&gt;,4&lt;BR /&gt;[Site of Specimen],4&lt;BR /&gt;9.,4&lt;BR /&gt;10.,4&lt;BR /&gt;11.,4&lt;BR /&gt;12.,4&lt;BR /&gt;13.,4&lt;BR /&gt;14.,4&lt;BR /&gt;,4&lt;BR /&gt;,4&lt;BR /&gt;D,4&lt;BR /&gt;,4&lt;BR /&gt;,4&lt;BR /&gt;Were any isolates obtained from this specimen?,4&lt;BR /&gt;,4&lt;BR /&gt;"If a Gram-positive pathogen was cultured, is it vancomycin-susceptible?(only applies to Enterococcus, Pediococcus, Lactobacillus or Leuconostoc) [hidden]",4&lt;BR /&gt;,4&lt;BR /&gt;"If the pathogen cultured is Staphylococcus aureus, is it oxacillin-susceptible? (methicillin-susceptible) [hidden]",4&lt;BR /&gt;,4&lt;BR /&gt;[Oxacillin-susceptible],5&lt;BR /&gt;,5&lt;BR /&gt;"If a Gram-negative pathogen was cultured, is it aztreonam susceptible? [hidden]",5&lt;BR /&gt;,5&lt;BR /&gt;Organism Genus/Species,5&lt;BR /&gt;,5&lt;BR /&gt;Entry,5&lt;BR /&gt;,5&lt;BR /&gt;Organism Genus/Species,5&lt;BR /&gt;,5&lt;BR /&gt;[Organism Genus/Species],5&lt;BR /&gt;;;;;&lt;BR /&gt;run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;gdb&amp;nbsp;dataset eg&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;data GDB;&lt;BR /&gt; infile datalines dsd truncover;&lt;BR /&gt; input question:$200.;&lt;BR /&gt;datalines4;&lt;BR /&gt;Preferred Term Code [hidden]&lt;BR /&gt;High Level Term Code [hidden]&lt;BR /&gt;High Level Group Term Code [hidden]&lt;BR /&gt;Body System or Organ Class Code [hidden]&lt;BR /&gt;Microbiology Subcategory&lt;BR /&gt;What was the site of specimen?&lt;BR /&gt;Specimen Type&lt;BR /&gt;"If a Gram-positive pathogen was cultured, is it vancomycin-susceptible?(only applies to Enterococcus, Pediococcus, Lactobacillus or Leuconostoc) [hidden]"&lt;BR /&gt;Were any isolates obtained from this specimen?&lt;BR /&gt;Organism Genus/Species&lt;BR /&gt;"If the pathogen cultured is Staphylococcus aureus, is it oxacillin-susceptible? (methicillin-susceptible) [hidden]"&lt;BR /&gt;"If a Gram-negative pathogen was cultured, is it aztreonam susceptible? [hidden]"&lt;BR /&gt;;;;;&lt;BR /&gt;run;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 09 Mar 2018 08:47:01 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Hash-find-vs-datastep-find/m-p/443263#M110878</guid>
      <dc:creator>Rajaram</dc:creator>
      <dc:date>2018-03-09T08:47:01Z</dc:date>
    </item>
    <item>
      <title>Re: Hash find vs datastep find</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Hash-find-vs-datastep-find/m-p/443266#M110880</link>
      <description>&lt;P&gt;Please post your data as two working SAS data steps creating this data and not as screenshots.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;A hash key lookup works only on an exact match, so it's not suitable for your use case.&lt;/P&gt;
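&lt;P&gt;A minimal sketch with made-up values to illustrate the point: the key is stored with its leading number, so the hash lookup with the clean text should return a non-zero code, while FIND() on the same strings returns a position. The hash is filled by hand here only to keep the example self-contained; the variable names rc_add, rc_find and pos are illustrative.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
  length acrf $50 pages $12;
  dcl hash h();
  h.definekey('acrf');
  h.definedata('pages');
  h.definedone();
  /* store one uncleaned entry, including its leading number */
  acrf = '19. Preferred Term Code [hidden]';
  pages = '1';
  rc_add = h.add();
  /* exact-match lookup with the clean text: non-zero means not found */
  rc_find = h.find(key: 'Preferred Term Code [hidden]');
  /* substring search: a position greater than 0 means found */
  pos = find(acrf, 'Preferred Term Code [hidden]');
  put rc_find= pos=;
run;&lt;/CODE&gt;&lt;/PRE&gt;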
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If your problem is only the potentially leading page numbers (digits, dot, blank) at the beginning of a string, then just clean up the string. A simple RegEx can do this job.&lt;/P&gt;</description>
      <pubDate>Wed, 07 Mar 2018 12:36:21 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Hash-find-vs-datastep-find/m-p/443266#M110880</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2018-03-07T12:36:21Z</dc:date>
    </item>
    <item>
      <title>Re: Hash find vs datastep find</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Hash-find-vs-datastep-find/m-p/443267#M110881</link>
      <description>&lt;P&gt;Please post your example datasets as data steps with datalines. NEVER post pictures of data, we've got better things to do than typing things off screenshots.&lt;/P&gt;</description>
      <pubDate>Wed, 07 Mar 2018 12:36:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Hash-find-vs-datastep-find/m-p/443267#M110881</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2018-03-07T12:36:30Z</dc:date>
    </item>
    <item>
      <title>Re: Hash find vs datastep find</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Hash-find-vs-datastep-find/m-p/443278#M110884</link>
      <description>&lt;P&gt;"&lt;SPAN&gt;&amp;nbsp;I can not use Perl regular expression since variable acrf is unstructured." - if it's unstructured then nothing will work on it; there needs to be a logical method for finding one string in another.&amp;nbsp; As another option, you could generate the code from one dataset to search the other (this still needs the logic, though):&lt;/SPAN&gt;&lt;/P&gt;
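&lt;PRE&gt;data _null_;
  set gdb end=last;
  /* open the generated step once; "have" stands for the uncleaned dataset */
  if _n_=1 then call execute('data want;  set have;');
  /* generate one IF statement per row of gdb */
  call execute(cats('if index(acrf,"',question,'") then found=1;'));
  /* close the generated step after the last row */
  if last then call execute('run;');
run;&lt;/PRE&gt;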
&lt;P&gt;&lt;SPAN&gt;This will create a datastep with one if for each row in gdb.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 07 Mar 2018 13:06:25 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Hash-find-vs-datastep-find/m-p/443278#M110884</guid>
      <dc:creator>RW9</dc:creator>
      <dc:date>2018-03-07T13:06:25Z</dc:date>
    </item>
    <item>
      <title>Re: Hash find vs datastep find</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Hash-find-vs-datastep-find/m-p/443298#M110890</link>
      <description>&lt;P&gt;Just to demonstrate what the clean-up could look like:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data demo;
  have_str='3. At what time did the adverse event start';
  want_str=prxchange('s/(^\d+\.\s)(.*)/\2/o',1,strip(have_str));
run;&lt;/CODE&gt;&lt;/PRE&gt;
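&lt;P&gt;Combined with an exact-match hash lookup, this could look roughly like the sketch below. It assumes a dataset ACRFTXT with the variables ACRF and PAGES and a dataset GDB with the variable QUESTION ($200); ACRF_CLEAN is a made-up intermediate dataset name, and the simplified pattern only removes a leading "digits, dot, blanks" prefix.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Step 1: remove the leading number prefix and keep non-blank rows */
data acrf_clean;
  set acrftxt;
  length question $200;
  question = prxchange('s/^\d+\.\s+//', 1, strip(acrf));
  if not missing(question);
  keep question pages;
run;

/* Step 2: exact-match lookup on the cleaned key */
data final;
  if _n_ = 1 then do;
    if 0 then set acrf_clean;   /* define QUESTION and PAGES in the PDV */
    dcl hash h(dataset: 'acrf_clean');
    h.definekey('question');
    h.definedata('pages');
    h.definedone();
  end;
  set gdb;
  /* PAGES is retained, so reset it when the key is not found */
  if h.find() ne 0 then call missing(pages);
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;If the same cleaned text occurs more than once, the hash keeps only the first entry loaded by default, so only the first page would be returned.&lt;/P&gt;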
&lt;P&gt;For anything else or more code: post your data in the form of working data steps; like the others, I'm not going to do this work for you.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 07 Mar 2018 13:43:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Hash-find-vs-datastep-find/m-p/443298#M110890</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2018-03-07T13:43:53Z</dc:date>
    </item>
    <item>
      <title>Re: Hash find vs datastep find</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Hash-find-vs-datastep-find/m-p/443641#M111011</link>
      <description>Thank you, Patrick. I have attached the dataset.</description>
      <pubDate>Thu, 08 Mar 2018 06:12:01 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Hash-find-vs-datastep-find/m-p/443641#M111011</guid>
      <dc:creator>Rajaram</dc:creator>
      <dc:date>2018-03-08T06:12:01Z</dc:date>
    </item>
    <item>
      <title>Re: Hash find vs datastep find</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Hash-find-vs-datastep-find/m-p/443642#M111012</link>
      <description>Thank you. I have attached the dataset.</description>
      <pubDate>Thu, 08 Mar 2018 06:12:49 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Hash-find-vs-datastep-find/m-p/443642#M111012</guid>
      <dc:creator>Rajaram</dc:creator>
      <dc:date>2018-03-08T06:12:49Z</dc:date>
    </item>
    <item>
      <title>Re: Hash find vs datastep find</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Hash-find-vs-datastep-find/m-p/443644#M111014</link>
      <description>Thank you, RW9. It works but takes a long time to execute, around 10 minutes, since the dataset contains a large number of observations.</description>
      <pubDate>Thu, 08 Mar 2018 06:21:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Hash-find-vs-datastep-find/m-p/443644#M111014</guid>
      <dc:creator>Rajaram</dc:creator>
      <dc:date>2018-03-08T06:21:48Z</dc:date>
    </item>
    <item>
      <title>Re: Hash find vs datastep find</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Hash-find-vs-datastep-find/m-p/443658#M111017</link>
      <description>&lt;P&gt;Posting .sas7bdat files is often not helpful, as encodings or byte order may prevent users from using them. That's why we STRONGLY recommend posting example data as a data step with datalines, since the code can simply be copied, pasted, and submitted in any environment.&lt;/P&gt;
&lt;P&gt;A macro to do that conversion automatically can be found here: &lt;A href="https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-data-AKA-generate/ta-p/258712" target="_blank"&gt;https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-data-AKA-generate/ta-p/258712&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 08 Mar 2018 07:18:49 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Hash-find-vs-datastep-find/m-p/443658#M111017</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2018-03-08T07:18:49Z</dc:date>
    </item>
    <item>
      <title>Re: Hash find vs datastep find</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Hash-find-vs-datastep-find/m-p/443875#M111089</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/34124"&gt;@Rajaram&lt;/a&gt;&lt;/P&gt;
&lt;P&gt;Something is missing in your data: In ds ACRFTXT there are only blanks for column ACRF.&lt;/P&gt;</description>
      <pubDate>Thu, 08 Mar 2018 19:58:04 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Hash-find-vs-datastep-find/m-p/443875#M111089</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2018-03-08T19:58:04Z</dc:date>
    </item>
    <item>
      <title>Re: Hash find vs datastep find</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Hash-find-vs-datastep-find/m-p/444031#M111124</link>
      <description>&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/11562"&gt;@Kurt_Bremser&lt;/a&gt;&lt;BR /&gt;&lt;BR /&gt;I have used that macro and added SAS code</description>
      <pubDate>Fri, 09 Mar 2018 08:49:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Hash-find-vs-datastep-find/m-p/444031#M111124</guid>
      <dc:creator>Rajaram</dc:creator>
      <dc:date>2018-03-09T08:49:09Z</dc:date>
    </item>
    <item>
      <title>Re: Hash find vs datastep find</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Hash-find-vs-datastep-find/m-p/444032#M111125</link>
      <description>&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/12447"&gt;@Patrick&lt;/a&gt;&lt;BR /&gt;&lt;BR /&gt;Yes, you are correct: the column has blank values. To give you some background, the aCRF text comes directly from a PDF (SaveAs Other/PDFTOTEXT). When I saved it from the PDF, all the formatting was lost because of the plain-text format. I cannot import PDF files directly in SAS, so I used the text format to search for the text and page numbers from the PDF.</description>
      <pubDate>Fri, 09 Mar 2018 08:55:42 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Hash-find-vs-datastep-find/m-p/444032#M111125</guid>
      <dc:creator>Rajaram</dc:creator>
      <dc:date>2018-03-09T08:55:42Z</dc:date>
    </item>
    <item>
      <title>Re: Hash find vs datastep find</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Hash-find-vs-datastep-find/m-p/444034#M111126</link>
      <description>&lt;P&gt;Why are you reading in the aCRF PDF in the first place?&amp;nbsp; This is a specification document for the database, so just extract the metadata directly from the database.&amp;nbsp; Reading in a PDF is going to be a lot of work, a worthless duplication of what already exists, and not robust: how will you handle changes, etc.?&lt;/P&gt;</description>
      <pubDate>Fri, 09 Mar 2018 09:11:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Hash-find-vs-datastep-find/m-p/444034#M111126</guid>
      <dc:creator>RW9</dc:creator>
      <dc:date>2018-03-09T09:11:30Z</dc:date>
    </item>
    <item>
      <title>Re: Hash find vs datastep find</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Hash-find-vs-datastep-find/m-p/444042#M111127</link>
      <description>&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/45151"&gt;@RW9&lt;/a&gt;&lt;BR /&gt;Thanks for replying. I am in the process of automating the aCRF for SDTM submission. I am following the paper &lt;A href="https://www.pharmasug.org/proceedings/2015/AD/PharmaSUG-2015-AD07.pdf" target="_blank"&gt;https://www.pharmasug.org/proceedings/2015/AD/PharmaSUG-2015-AD07.pdf&lt;/A&gt;</description>
      <pubDate>Fri, 09 Mar 2018 10:19:06 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Hash-find-vs-datastep-find/m-p/444042#M111127</guid>
      <dc:creator>Rajaram</dc:creator>
      <dc:date>2018-03-09T10:19:06Z</dc:date>
    </item>
    <item>
      <title>Re: Hash find vs datastep find</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Hash-find-vs-datastep-find/m-p/444046#M111128</link>
      <description>&lt;P&gt;Well, I don't know that paper; however, the process itself sounds backwards.&amp;nbsp; PDFs are outputs: they are only meant for people to look at and are not conducive to any other process.&amp;nbsp; Most databases (Oracle, Medidata Rave etc.) have modules designed for standard CRF builds.&amp;nbsp; These are accessible to Data Management staff, who are responsible for this part, and to other users.&amp;nbsp; It is down to the DM group to create standardised CRF libraries and then use these to implement database builds.&amp;nbsp; As a programmer, you can simply extract this metadata directly from the database.&amp;nbsp; This is the preferred method because all the information is then created and entered in one place (one of the main reasons we use databases in the first place), it is stored&amp;nbsp;in&amp;nbsp;a usable format, and it provides the option to extract raw data or produce reports.&amp;nbsp; Doing this process the other way round, getting an output and then reading that in and processing it, loses all of this: if anything changes, you need to start again by getting the output and processing it.&lt;/P&gt;</description>
      <pubDate>Fri, 09 Mar 2018 10:43:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Hash-find-vs-datastep-find/m-p/444046#M111128</guid>
      <dc:creator>RW9</dc:creator>
      <dc:date>2018-03-09T10:43:16Z</dc:date>
    </item>
  </channel>
</rss>

