<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Perform logic in one dataset based on a condition in a separate dataset in New SAS User</title>
    <link>https://communities.sas.com/t5/New-SAS-User/Perform-logic-in-one-dataset-based-on-a-condition-in-a-separate/m-p/534972#M6396</link>
    <description>&lt;P&gt;How large is 'extremely large'?&lt;/P&gt;</description>
    <pubDate>Tue, 12 Feb 2019 19:35:40 GMT</pubDate>
    <dc:creator>PeterClemmensen</dc:creator>
    <dc:date>2019-02-12T19:35:40Z</dc:date>
    <item>
      <title>Perform logic in one dataset based on a condition in a separate dataset</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Perform-logic-in-one-dataset-based-on-a-condition-in-a-separate/m-p/534971#M6395</link>
      <description>&lt;P&gt;Hi,&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have an extremely large dataset that crashes my SAS for being too large if I try to perform any commands on it (e.g., proc sort or merge). I only need to work with a small subset of that data; however, the subset I need to work on is identified by a list of id's that is currently in a separate dataset ("smallDataset").&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;Is there any way to do something to the effect of:&lt;/P&gt;&lt;PRE&gt;data x;
&amp;nbsp; &amp;nbsp;set largeDataset;
&amp;nbsp; &amp;nbsp;where id in smallDataset and largeDataset;
run;&lt;/PRE&gt;&lt;P&gt;I could then merge the large and small datasets no problem.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Any other suggestions of how to accomplish this would be great.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Current code:&lt;/P&gt;&lt;PRE&gt;data largeDataset; set '/folders/myfolders/largeDataset.sas7bdat'; run;
proc sort data=largeDataset; by caseid; run;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;**Crashes here^^^&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sort data=smallDataset;
by caseid;
run;


data merged;
merge smallDataset largeDataset;
by caseid;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 12 Feb 2019 19:34:25 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Perform-logic-in-one-dataset-based-on-a-condition-in-a-separate/m-p/534971#M6395</guid>
      <dc:creator>thanksforhelp12</dc:creator>
      <dc:date>2019-02-12T19:34:25Z</dc:date>
    </item>
    <item>
      <title>Re: Perform logic in one dataset based on a condition in a separate dataset</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Perform-logic-in-one-dataset-based-on-a-condition-in-a-separate/m-p/534972#M6396</link>
      <description>&lt;P&gt;How large is 'extremely large'?&lt;/P&gt;</description>
      <pubDate>Tue, 12 Feb 2019 19:35:40 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Perform-logic-in-one-dataset-based-on-a-condition-in-a-separate/m-p/534972#M6396</guid>
      <dc:creator>PeterClemmensen</dc:creator>
      <dc:date>2019-02-12T19:35:40Z</dc:date>
    </item>
    <item>
      <title>Re: Perform logic in one dataset based on a condition in a separate dataset</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Perform-logic-in-one-dataset-based-on-a-condition-in-a-separate/m-p/534973#M6397</link>
      <description>&lt;P&gt;It's around 10gb (1 million rows by 300 columns). I know some people say that isn't that big, but it has been crashing every time I try to do anything with it. I got it working by using a custom user folder and selecting a subset of the data (with where) before I ever load it in with a "data" command. That worked well until now, when I need to select the subset based on another .sas7bdat file. Here is the error it throws when I try to sort (there is 300gb of HDD space free).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;DIV&gt;&lt;DIV class="sasError"&gt;ERROR: No disk space is available for the write operation. Filename =&lt;/DIV&gt;&lt;DIV class="sasError"&gt;/tmp/SAS_util000100000A6F_localhost.localdomain/ut0A6F000005.utl.&lt;/DIV&gt;&lt;DIV class="sasError"&gt;ERROR: Failure while attempting to write page 3728 of sorted run 6.&lt;/DIV&gt;&lt;DIV class="sasError"&gt;ERROR: Failure while attempting to write page 44778 to utility file 1.&lt;/DIV&gt;&lt;DIV class="sasError"&gt;ERROR: Failure encountered while creating initial set of sorted runs.&lt;/DIV&gt;&lt;DIV class="sasError"&gt;ERROR: Failure encountered during external sort.&lt;/DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;DIV class="sasError"&gt;ERROR: Sort execution failure.&lt;/DIV&gt;&lt;/DIV&gt;&lt;PRE class="sasLog"&gt;&amp;nbsp;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 12 Feb 2019 19:40:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Perform-logic-in-one-dataset-based-on-a-condition-in-a-separate/m-p/534973#M6397</guid>
      <dc:creator>thanksforhelp12</dc:creator>
      <dc:date>2019-02-12T19:40:59Z</dc:date>
    </item>
    <item>
      <title>Re: Perform logic in one dataset based on a condition in a separate dataset</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Perform-logic-in-one-dataset-based-on-a-condition-in-a-separate/m-p/534974#M6398</link>
      <description>&lt;P&gt;Ok. What about your small dataset?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You can write your subsetting code like this without sorting&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data merged;
   if _N_ = 1 then do;
      declare hash h(dataset:'smallDataset');
      h.defineKey('caseid');
      h.defineDone();
   end;

   set largeDataset;

   if h.check() = 0;   /* check() returns 0 when caseid is found in smallDataset */
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 12 Feb 2019 19:43:27 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Perform-logic-in-one-dataset-based-on-a-condition-in-a-separate/m-p/534974#M6398</guid>
      <dc:creator>PeterClemmensen</dc:creator>
      <dc:date>2019-02-12T19:43:27Z</dc:date>
    </item>
    <item>
      <title>Re: Perform logic in one dataset based on a condition in a separate dataset</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Perform-logic-in-one-dataset-based-on-a-condition-in-a-separate/m-p/534979#M6399</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/256012"&gt;@thanksforhelp12&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;It's around 10gb (1 million rows by 300 columns). I know some people say that isn't that big, but it has been crashing every time I try to do anything with it. I got it working by using a custom user folder and selecting a subset of the data (with where) before I ever load it in with a "data" command. That worked well until now, when I need to select the subset based on another .sas7bdat file. Here is the error it throws when I try to sort (there is 300gb of HDD space free).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;DIV&gt;
&lt;DIV class="sasError"&gt;ERROR: &lt;STRONG&gt;&lt;FONT color="#0000ff"&gt;No disk space is available&lt;/FONT&gt; &lt;/STRONG&gt;for the write operation. Filename =&lt;/DIV&gt;
&lt;DIV class="sasError"&gt;/tmp/SAS_util000100000A6F_localhost.localdomain/ut0A6F000005.utl.&lt;/DIV&gt;
&lt;DIV class="sasError"&gt;ERROR: Failure while attempting to write page 3728 of sorted run 6.&lt;/DIV&gt;
&lt;DIV class="sasError"&gt;ERROR: Failure while attempting to write page 44778 to utility file 1.&lt;/DIV&gt;
&lt;DIV class="sasError"&gt;ERROR: Failure encountered while creating initial set of sorted runs.&lt;/DIV&gt;
&lt;DIV class="sasError"&gt;ERROR: Failure encountered during external sort.&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;DIV&gt;
&lt;DIV class="sasError"&gt;ERROR: Sort execution failure.&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;PRE class="sasLog"&gt;&amp;nbsp;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;If you are working in a networked environment such as Enterprise Guide perhaps your SAS admin has restricted the amount of disk space you are allowed to use.&lt;/P&gt;
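&lt;P&gt;To see where SAS is writing its temporary files, something like the following may help. It is only a sketch: the /tmp path in the log is the utility file location used by the sort, and it may sit on a smaller partition than the disk with 300gb free. Whether you can change that location depends on how your SAS session is configured.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Print the current location of the WORK library */
proc options option=work;
run;

/* Print the current location of the utility files used by PROC SORT */
proc options option=utilloc;
run;&lt;/CODE&gt;&lt;/PRE&gt;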
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you are using Proc Sort, you may want to try the TAGSORT option. With TAGSORT, only the sort keys and an observation number are written to the temporary files, so most of the 300 variables are left out of the mechanics of the sort; the full observations are then retrieved in the sorted order of the key variables.&lt;/P&gt;</description>
      <pubDate>Tue, 12 Feb 2019 20:08:56 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Perform-logic-in-one-dataset-based-on-a-condition-in-a-separate/m-p/534979#M6399</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2019-02-12T20:08:56Z</dc:date>
    </item>
    <item>
      <title>Re: Perform logic in one dataset based on a condition in a separate dataset</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Perform-logic-in-one-dataset-based-on-a-condition-in-a-separate/m-p/534981#M6400</link>
      <description>&lt;P&gt;Small dataset is small. Like 20k rows by 19 col. Thanks a lot for the subset code.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;It executed successfully; however, it returned a large dataset without the new columns from smallDataset added&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 12 Feb 2019 20:11:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Perform-logic-in-one-dataset-based-on-a-condition-in-a-separate/m-p/534981#M6400</guid>
      <dc:creator>thanksforhelp12</dc:creator>
      <dc:date>2019-02-12T20:11:50Z</dc:date>
    </item>
    <item>
      <title>Re: Perform logic in one dataset based on a condition in a separate dataset</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Perform-logic-in-one-dataset-based-on-a-condition-in-a-separate/m-p/534989#M6402</link>
      <description>&lt;P&gt;What new columns? Please be more specific &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 12 Feb 2019 20:31:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Perform-logic-in-one-dataset-based-on-a-condition-in-a-separate/m-p/534989#M6402</guid>
      <dc:creator>PeterClemmensen</dc:creator>
      <dc:date>2019-02-12T20:31:05Z</dc:date>
    </item>
    <item>
      <title>Re: Perform logic in one dataset based on a condition in a separate dataset</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Perform-logic-in-one-dataset-based-on-a-condition-in-a-separate/m-p/534991#M6403</link>
      <description>&lt;P&gt;Large dataset has information on 300 variables for 2 million cases.&lt;/P&gt;&lt;P&gt;Small dataset has an additional 20 variables for 20,000 of those 2 million cases.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I would like to perform analyses on those 20,000 cases including both the original 300 and additional 20 variables (columns).&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;Therefore, my goal is to merge smallDataset with the subset of cases from largeDataset whose caseid is also in smallDataset (i.e., to end up with 20,000 cases (rows) of 320 variables).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The code above did run; however, it just returned the original 2 million rows by 300 columns.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 12 Feb 2019 20:39:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Perform-logic-in-one-dataset-based-on-a-condition-in-a-separate/m-p/534991#M6403</guid>
      <dc:creator>thanksforhelp12</dc:creator>
      <dc:date>2019-02-12T20:39:09Z</dc:date>
    </item>
    <item>
      <title>Re: Perform logic in one dataset based on a condition in a separate dataset</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Perform-logic-in-one-dataset-based-on-a-condition-in-a-separate/m-p/534992#M6404</link>
      <description>&lt;P&gt;Yeah, it's likely too large for SAS University Edition, which it appears you're using. If you can use a full version, you won't run into these issues. I assume you have also increased the RAM setting and enabled dual core?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You can get a subset by using the following code.&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
create table sub as
select *
from largeDataset
where caseid in (select caseid from smallDataset);
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;This is one case where it may make sense to make several subsets and loop through them to get what you need, if possible.&lt;/P&gt;
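&lt;P&gt;A rough sketch of that idea, assuming two chunks of 1,000,000 rows each. The chunk size and the names part1, part2 and sub are placeholders, and the hash lookup is borrowed from the earlier reply in this thread rather than being the only way to filter each chunk.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Chunk 1: rows 1 to 1,000,000 of largeDataset */
data part1;
   if _N_ = 1 then do;
      declare hash h(dataset:'smallDataset');
      h.defineKey('caseid');
      h.defineDone();
   end;
   set largeDataset(firstobs=1 obs=1000000);
   if h.check() = 0;   /* keep rows whose caseid is in smallDataset */
run;

/* Chunk 2: rows 1,000,001 to 2,000,000 of largeDataset */
data part2;
   if _N_ = 1 then do;
      declare hash h(dataset:'smallDataset');
      h.defineKey('caseid');
      h.defineDone();
   end;
   set largeDataset(firstobs=1000001 obs=2000000);
   if h.check() = 0;
run;

/* Stack the filtered pieces into one table */
data sub;
   set part1 part2;
run;&lt;/CODE&gt;&lt;/PRE&gt;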
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/256012"&gt;@thanksforhelp12&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;Large dataset has information on 300 variables for 2 million cases.&lt;/P&gt;
&lt;P&gt;Small dataset has an additional 20 variables for 20,000 of those 2 million cases.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I would like to perform analyses on those 20,000 cases including both the original 300 and additional 20 variables (columns).&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;Therefore, my goal is to merge smallDataset with the subset of cases from largeDataset whose caseid is also in smallDataset (i.e., to end up with 20,000 cases (rows) of 320 variables).&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
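&lt;P&gt;For the goal quoted above (20,000 rows with all 320 variables), the subset and the merge could also be attempted in one step. This is a sketch that assumes caseid is the key in both tables; it has not been tested against the disk limits described earlier in the thread.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
   /* Inner join keeps only the cases present in both tables.
      caseid exists in both, so SAS logs a warning and keeps one copy. */
   create table merged as
   select *
   from smallDataset as s
        inner join largeDataset as l
        on s.caseid = l.caseid;
quit;&lt;/CODE&gt;&lt;/PRE&gt;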
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 12 Feb 2019 20:39:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Perform-logic-in-one-dataset-based-on-a-condition-in-a-separate/m-p/534992#M6404</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2019-02-12T20:39:33Z</dc:date>
    </item>
    <item>
      <title>Re: Perform logic in one dataset based on a condition in a separate dataset</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Perform-logic-in-one-dataset-based-on-a-condition-in-a-separate/m-p/534995#M6405</link>
      <description>&lt;P&gt;Ah, OK. Then do it like this, and replace var1, var2, var3 in the call missing statement with your actual variables from smallDataset.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data merged;
   if 0 then set smallDataset;

   if _N_ = 1 then do;
      declare hash h(dataset:'smallDataset');
      h.defineKey('caseid');
      h.definedata(all:'Y');
      h.defineDone();

      /* Replace var1, var2, var3 with your actual variables from smallDataset.
         CALL MISSING takes variable names, not quoted strings. */
      call missing(var1, var2, var3);
   end;

   set largeDataset;

   if h.find() = 0;   /* find() returns 0 when caseid is found in smallDataset */
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 12 Feb 2019 20:43:20 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Perform-logic-in-one-dataset-based-on-a-condition-in-a-separate/m-p/534995#M6405</guid>
      <dc:creator>PeterClemmensen</dc:creator>
      <dc:date>2019-02-12T20:43:20Z</dc:date>
    </item>
    <item>
      <title>Re: Perform logic in one dataset based on a condition in a separate dataset</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Perform-logic-in-one-dataset-based-on-a-condition-in-a-separate/m-p/535003#M6407</link>
      <description>This is exactly what I was looking to do and worked perfectly.&lt;BR /&gt;&lt;BR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/31304"&gt;@PeterClemmensen&lt;/a&gt;, thank you so much for your help as well, I really appreciate it, and I am sure your new solution would have worked, too. As a newbie around here, I do not know how to select which of you to accept as the solution. Does it affect you all in any way? &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13879"&gt;@Reeza&lt;/a&gt;, thanks again, this is exactly what I was looking to do in the original code.</description>
      <pubDate>Tue, 12 Feb 2019 20:53:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Perform-logic-in-one-dataset-based-on-a-condition-in-a-separate/m-p/535003#M6407</guid>
      <dc:creator>thanksforhelp12</dc:creator>
      <dc:date>2019-02-12T20:53:45Z</dc:date>
    </item>
    <item>
      <title>Re: Perform logic in one dataset based on a condition in a separate dataset</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Perform-logic-in-one-dataset-based-on-a-condition-in-a-separate/m-p/535006#M6408</link>
      <description>&lt;P&gt;I'm glad you found your answer.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13879"&gt;@Reeza&lt;/a&gt;'s answer worked for you, then her answer is the one to mark as a solution &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 12 Feb 2019 20:57:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Perform-logic-in-one-dataset-based-on-a-condition-in-a-separate/m-p/535006#M6408</guid>
      <dc:creator>PeterClemmensen</dc:creator>
      <dc:date>2019-02-12T20:57:44Z</dc:date>
    </item>
  </channel>
</rss>

