<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Replicate PROC SQL for 200+categories - format library in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Replicate-PROC-SQL-for-200-categories-format-library/m-p/504869#M135166</link>
    <description>&lt;P&gt;Hi guys,&lt;/P&gt;&lt;P&gt;I am working on a simple PROC SQL command that creates distinct patient level data for each category. I have to do this step because the next macro is looking for a unique patient level data as one of its inputs.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;My end goal:&lt;/P&gt;&lt;P&gt;Produce means of "amount paid" for each category.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;This is how the original data (sourcedata) looks like:&lt;/P&gt;&lt;TABLE&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;Patient&lt;/TD&gt;&lt;TD&gt;Claims_ID&lt;/TD&gt;&lt;TD&gt;Description&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;12&lt;/TD&gt;&lt;TD&gt;A&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;13&lt;/TD&gt;&lt;TD&gt;A&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;15&lt;/TD&gt;&lt;TD&gt;Y&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;2&lt;/TD&gt;&lt;TD&gt;16&lt;/TD&gt;&lt;TD&gt;D&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;2&lt;/TD&gt;&lt;TD&gt;19&lt;/TD&gt;&lt;TD&gt;B&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;2&lt;/TD&gt;&lt;TD&gt;26&lt;/TD&gt;&lt;TD&gt;D&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Issue here is:&lt;/P&gt;&lt;P&gt;I have to adjust with an existing macro that takes only unique patient level data - to create "amount paid" at patient level..So I have to first create individual category data sets (unique at patient level) from the master data. After this step, I can pass them onto the macro.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;My approach:&lt;/P&gt;&lt;P&gt;%let cat = Diagnosis1 &lt;SPAN&gt;Diagnosis&lt;/SPAN&gt;2 &lt;SPAN&gt;Diagnosis&lt;/SPAN&gt;3.... &lt;SPAN&gt;Diagnosis&lt;/SPAN&gt;200;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;/* macro below simply creates distinct patient level data for each category - this is used to identify patients in that category */&lt;/P&gt;&lt;P&gt;%macro create1();&lt;/P&gt;&lt;P&gt;%do i = 1 %to &amp;amp;countcat.;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;/*&lt;SPAN&gt;countcat contains number of categories say 200 */&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;%let eachCat = %scan(&amp;amp;cat, &amp;amp;i);&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; /* cat is a mcro variable containing each of the 200 category names separated by space */&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;PROC SQL;&lt;/P&gt;&lt;P&gt;Create table uniquePatient_&amp;amp;eachCat.&amp;nbsp; /* creates dataset for each category */&lt;/P&gt;&lt;P&gt;select distinct patientID FROM&amp;nbsp;&lt;/P&gt;&lt;P&gt;Sourcedata&lt;/P&gt;&lt;P&gt;where prxmacth("m,&amp;amp;eachCat./oi", category) &amp;gt; 0; /* this filters the data to that category */&lt;/P&gt;&lt;P&gt;quit;&lt;/P&gt;&lt;P&gt;%end;&lt;/P&gt;&lt;P&gt;%mend();&lt;/P&gt;&lt;P&gt;%create1(); /* invoke the macro */&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Sourcedata has 50million observations which are at patientID and claims ID level.&lt;/P&gt;&lt;P&gt;This simple macro is taking 3.5 minutes to create dataset for each category so basically it will take 10 hours to complete this run!&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I need distinct patient ID data that will be entered into another macro otherwise I would have thought of anything other than creating so many datasets as done above. I have heard that you can create format libraries that can speed up this process..any idea on how to go about it and if it may be&amp;nbsp;applicable in this scenario - will be greatly appreciated!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Regards&lt;/P&gt;</description>
    <pubDate>Tue, 16 Oct 2018 20:23:21 GMT</pubDate>
    <dc:creator>rosegarden81</dc:creator>
    <dc:date>2018-10-16T20:23:21Z</dc:date>
    <item>
      <title>Replicate PROC SQL for 200+categories - format library</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Replicate-PROC-SQL-for-200-categories-format-library/m-p/504869#M135166</link>
      <description>&lt;P&gt;Hi guys,&lt;/P&gt;&lt;P&gt;I am working on a simple PROC SQL command that creates distinct patient level data for each category. I have to do this step because the next macro is looking for a unique patient level data as one of its inputs.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;My end goal:&lt;/P&gt;&lt;P&gt;Produce means of "amount paid" for each category.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;This is how the original data (sourcedata) looks like:&lt;/P&gt;&lt;TABLE&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;Patient&lt;/TD&gt;&lt;TD&gt;Claims_ID&lt;/TD&gt;&lt;TD&gt;Description&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;12&lt;/TD&gt;&lt;TD&gt;A&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;13&lt;/TD&gt;&lt;TD&gt;A&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;15&lt;/TD&gt;&lt;TD&gt;Y&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;2&lt;/TD&gt;&lt;TD&gt;16&lt;/TD&gt;&lt;TD&gt;D&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;2&lt;/TD&gt;&lt;TD&gt;19&lt;/TD&gt;&lt;TD&gt;B&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;2&lt;/TD&gt;&lt;TD&gt;26&lt;/TD&gt;&lt;TD&gt;D&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Issue here is:&lt;/P&gt;&lt;P&gt;I have to adjust with an existing macro that takes only unique patient level data - to create "amount paid" at patient level..So I have to first create individual category data sets (unique at patient level) from the master data. After this step, I can pass them onto the macro.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;My approach:&lt;/P&gt;&lt;P&gt;%let cat = Diagnosis1 &lt;SPAN&gt;Diagnosis&lt;/SPAN&gt;2 &lt;SPAN&gt;Diagnosis&lt;/SPAN&gt;3.... &lt;SPAN&gt;Diagnosis&lt;/SPAN&gt;200;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;/* macro below simply creates distinct patient level data for each category - this is used to identify patients in that category */&lt;/P&gt;&lt;P&gt;%macro create1();&lt;/P&gt;&lt;P&gt;%do i = 1 %to &amp;amp;countcat.;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;/*&lt;SPAN&gt;countcat contains number of categories say 200 */&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;%let eachCat = %scan(&amp;amp;cat, &amp;amp;i);&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; /* cat is a mcro variable containing each of the 200 category names separated by space */&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;PROC SQL;&lt;/P&gt;&lt;P&gt;Create table uniquePatient_&amp;amp;eachCat.&amp;nbsp; /* creates dataset for each category */&lt;/P&gt;&lt;P&gt;select distinct patientID FROM&amp;nbsp;&lt;/P&gt;&lt;P&gt;Sourcedata&lt;/P&gt;&lt;P&gt;where prxmacth("m,&amp;amp;eachCat./oi", category) &amp;gt; 0; /* this filters the data to that category */&lt;/P&gt;&lt;P&gt;quit;&lt;/P&gt;&lt;P&gt;%end;&lt;/P&gt;&lt;P&gt;%mend();&lt;/P&gt;&lt;P&gt;%create1(); /* invoke the macro */&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Sourcedata has 50million observations which are at patientID and claims ID level.&lt;/P&gt;&lt;P&gt;This simple macro is taking 3.5 minutes to create dataset for each category so basically it will take 10 hours to complete this run!&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I need distinct patient ID data that will be entered into another macro otherwise I would have thought of anything other than creating so many datasets as done above. I have heard that you can create format libraries that can speed up this process..any idea on how to go about it and if it may be&amp;nbsp;applicable in this scenario - will be greatly appreciated!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Regards&lt;/P&gt;</description>
      <pubDate>Tue, 16 Oct 2018 20:23:21 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Replicate-PROC-SQL-for-200-categories-format-library/m-p/504869#M135166</guid>
      <dc:creator>rosegarden81</dc:creator>
      <dc:date>2018-10-16T20:23:21Z</dc:date>
    </item>
    <item>
      <title>Re: Replicate PROC SQL for 200+categories - format library</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Replicate-PROC-SQL-for-200-categories-format-library/m-p/504878#M135170</link>
      <description>You can do rsubmit which can process all 200 tables at same time which reduces time.&lt;BR /&gt;To do rsubmit you need to follow the below link&lt;BR /&gt;&lt;BR /&gt;&lt;A href="https://communities.sas.com/t5/SAS-Programming/Base-sas-connect-to-remote-windows-server/m-p/420967#M103576" target="_blank"&gt;https://communities.sas.com/t5/SAS-Programming/Base-sas-connect-to-remote-windows-server/m-p/420967#M103576&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;once done you can use rsubmit for parallel processing</description>
      <pubDate>Tue, 16 Oct 2018 20:40:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Replicate-PROC-SQL-for-200-categories-format-library/m-p/504878#M135170</guid>
      <dc:creator>sidpesar</dc:creator>
      <dc:date>2018-10-16T20:40:45Z</dc:date>
    </item>
    <item>
      <title>Re: Replicate PROC SQL for 200+categories - format library</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Replicate-PROC-SQL-for-200-categories-format-library/m-p/504894#M135177</link>
      <description>I work on a remote desktop which does not have internet access! Hence, I wouldnt be able to use an r submit I guess..</description>
      <pubDate>Tue, 16 Oct 2018 21:02:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Replicate-PROC-SQL-for-200-categories-format-library/m-p/504894#M135177</guid>
      <dc:creator>rosegarden81</dc:creator>
      <dc:date>2018-10-16T21:02:45Z</dc:date>
    </item>
    <item>
      <title>Re: Replicate PROC SQL for 200+categories - format library</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Replicate-PROC-SQL-for-200-categories-format-library/m-p/504896#M135178</link>
      <description>you can rusbmit on your own machine or remote machine. when you run on your machine sas opens instances and run your job(s) parallel and then we can send data to your current sas work folder</description>
      <pubDate>Tue, 16 Oct 2018 21:07:40 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Replicate-PROC-SQL-for-200-categories-format-library/m-p/504896#M135178</guid>
      <dc:creator>sidpesar</dc:creator>
      <dc:date>2018-10-16T21:07:40Z</dc:date>
    </item>
  </channel>
</rss>

