<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Comparing Input/Output Datasets in SAS Procedures</title>
    <link>https://communities.sas.com/t5/SAS-Procedures/Comparing-Input-Output-Datasets/m-p/35477#M8751</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;What happens between the initial input and the final output?&amp;nbsp; Obviously, you have to keep the fields that appear in the final output's KEEP statement, but what other processes (DATA steps, SQL code, procs) run in between?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Such tasks can be either trivial or extremely difficult, depending on your code.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Mon, 23 Jan 2012 04:20:37 GMT</pubDate>
    <dc:creator>art297</dc:creator>
    <dc:date>2012-01-23T04:20:37Z</dc:date>
    <item>
      <title>Comparing Input/Output Datasets</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Comparing-Input-Output-Datasets/m-p/35476#M8750</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I have a SAS program (multiple macros deep) that has:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;1. Several input datasets with several thousand fields, all being read in.&lt;/P&gt;&lt;P&gt;2. A few output flat files with hundreds of fields.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The issue is that the intermediate datasets are so large that they cause space issues and performance is very poor. I want to identify only those fields on the input side that are essential, and use a KEEP statement.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Besides manual tracing, is there any tool or method that I should use? We have SAS 9.1.3 on AIX 5.3.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I appreciate any guidance.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Biplob&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 23 Jan 2012 04:14:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Comparing-Input-Output-Datasets/m-p/35476#M8750</guid>
      <dc:creator>Biplob94</dc:creator>
      <dc:date>2012-01-23T04:14:45Z</dc:date>
    </item>
    <item>
      <title>Comparing Input/Output Datasets</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Comparing-Input-Output-Datasets/m-p/35477#M8751</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;What happens between the initial input and the final output?&amp;nbsp; Obviously, you have to keep the fields that appear in the final output's KEEP statement, but what other processes (DATA steps, SQL code, procs) run in between?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Such tasks can be either trivial or extremely difficult, depending on your code.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 23 Jan 2012 04:20:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Comparing-Input-Output-Datasets/m-p/35477#M8751</guid>
      <dc:creator>art297</dc:creator>
      <dc:date>2012-01-23T04:20:37Z</dc:date>
    </item>
    <item>
      <title>Comparing Input/Output Datasets</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Comparing-Input-Output-Datasets/m-p/35478#M8752</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;The code below creates TABLE_A containing the rows returned by the first query that are not returned by the second.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;PROC SQL;&lt;/P&gt;&lt;P&gt;CREATE TABLE TABLE_A AS&lt;/P&gt;&lt;P&gt;SELECT * FROM TABLE1&lt;/P&gt;&lt;P&gt;EXCEPT&lt;/P&gt;&lt;P&gt;SELECT * FROM TABLE2;&lt;/P&gt;&lt;P&gt;QUIT;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 02 Feb 2012 15:30:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Comparing-Input-Output-Datasets/m-p/35478#M8752</guid>
      <dc:creator>Hima</dc:creator>
      <dc:date>2012-02-02T15:30:35Z</dc:date>
    </item>
    <item>
      <title>Comparing Input/Output Datasets</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Comparing-Input-Output-Datasets/m-p/35479#M8753</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Biplob,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The first thing to consider is whether you want to improve disk space usage, performance, or both.&amp;nbsp; I suspect it's "both", since macros tend to be used repeatedly.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The disk space issue can probably be solved by creating views instead of data sets.&amp;nbsp; Both DATA steps and SQL can create views, and a single step can use one view as input and create another view as output.&amp;nbsp; Without seeing the code (and I'm not really asking for that), it's hard to be more specific.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Adding KEEP= will improve performance, but there is no shortcut.&amp;nbsp; You have to work through the code manually to figure out which variables are needed at each point.&amp;nbsp; You could start with the variables in the final output, but other variables might be needed earlier in the program.&amp;nbsp; For example, variables might be used to subset observations or to make calculations, but might not be needed after that point.&amp;nbsp; The only tool that can figure this out is the human brain.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;One style that I like is to add a set of %LET statements to the outermost macro:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;%let keeplist1 = a long list of variables;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;%let keeplist2 = a different list;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Then refer to &amp;amp;KEEPLIST1 and &amp;amp;KEEPLIST2 in later code.&amp;nbsp; This makes the program easier to read, update, and debug.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Note that there is a difference between KEEP= on the SET statement and KEEP= on the DATA statement.&amp;nbsp; The first limits what you read in, and the second limits what you save.&amp;nbsp; You might find steps that use KEEP= on both, with different sets of variables.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;To test your results, I suggest you run your current set of macros on either small data sets or just one input data set.&amp;nbsp; Save the result (preferably as a SAS data set).&amp;nbsp; Then, after modifying the code, use PROC COMPARE to see whether the new output differs from the old.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Good luck.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 02 Feb 2012 17:02:36 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Comparing-Input-Output-Datasets/m-p/35479#M8753</guid>
      <dc:creator>Astounding</dc:creator>
      <dc:date>2012-02-02T17:02:36Z</dc:date>
    </item>
    <item>
      <title>Comparing Input/Output Datasets</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Comparing-Input-Output-Datasets/m-p/35480#M8754</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Thanks to everyone who responded. I now know what to do.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;This is the first time I have used a forum for help, and I am very impressed by the depth and promptness of the answers.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 08 Feb 2012 03:38:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Comparing-Input-Output-Datasets/m-p/35480#M8754</guid>
      <dc:creator>Biplob94</dc:creator>
      <dc:date>2012-02-08T03:38:39Z</dc:date>
    </item>
  </channel>
</rss>

