<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Reduce run time in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Reduce-run-time/m-p/841652#M332808</link>
    <description>Could you please show the code?</description>
    <pubDate>Mon, 31 Oct 2022 14:12:07 GMT</pubDate>
    <dc:creator>Ronein</dc:creator>
    <dc:date>2022-10-31T14:12:07Z</dc:date>
    <item>
      <title>Reduce run time</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reduce-run-time/m-p/841619#M332788</link>
      <description>&lt;P&gt;Hello&lt;/P&gt;
&lt;P&gt;I want to create a new data set from an existing data set.&lt;/P&gt;
&lt;P&gt;The existing data set is very big and contains 100 million rows.&lt;/P&gt;
&lt;P&gt;Running PROC SORT with WHERE conditions takes a very long time.&lt;/P&gt;
&lt;P&gt;Which do you think is the better way to run it, Way1 or Way2? Or is there another way that would help reduce the run time?&lt;/P&gt;
&lt;P&gt;Sorry that I cannot send the data.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/****Way1*****/
%let YYMM=2209;

proc sort data=ttt(Where=(UPDATE_DATE  ne '16MAY2022'd  AND input(put(UPDATE_DATE,yymmn4.),4.)=&amp;amp;YYMM.))  out=want;
by   UPDATE_DATE    FK_A       FK_B    REFERENCE_DATE  ;
Run;


/****Way2*****/
%let YYMM=2209;

Data temp;
SET ttt(Where=(UPDATE_DATE  ne '16MAY2022'd  AND input(put(UPDATE_DATE,yymmn4.),4.)=&amp;amp;YYMM.));
Run;

proc sort data=temp out=want;
by   UPDATE_DATE    FK_A       FK_B    REFERENCE_DATE  ;
Run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Mon, 31 Oct 2022 12:23:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reduce-run-time/m-p/841619#M332788</guid>
      <dc:creator>Ronein</dc:creator>
      <dc:date>2022-10-31T12:23:00Z</dc:date>
    </item>
    <item>
      <title>Re: Reduce run time</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reduce-run-time/m-p/841620#M332789</link>
      <description>Do NOT use the PUT() and INPUT() functions in your code. Try other functions, like this:&lt;BR /&gt;&lt;BR /&gt;UPDATE_DATE  ne '16MAY2022'd  AND input(put(UPDATE_DATE,yymmn4.),4.)=&amp;amp;YYMM.&lt;BR /&gt;----&amp;gt;&lt;BR /&gt;year(UPDATE_DATE)=2022 AND month(UPDATE_DATE)=9</description>
      <pubDate>Mon, 31 Oct 2022 12:34:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reduce-run-time/m-p/841620#M332789</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2022-10-31T12:34:58Z</dc:date>
    </item>
    <item>
      <title>Re: Reduce run time</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reduce-run-time/m-p/841621#M332790</link>
      <description>&lt;P&gt;Sorting is slow. First I would consider whether you really need to sort.&amp;nbsp; Perhaps a hash-table or some other method might be an alternative.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I would think (hope) that the PROC SORT with the WHERE would be faster than using two steps.&amp;nbsp; But in both approaches, assuming your data is not indexed, you need to read all 100m rows just to select the rows you want.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
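&lt;P&gt;If the selected month is only a small share of the table, an index on UPDATE_DATE might let the WHERE avoid the full read. A minimal sketch, assuming ttt sits in the WORK library and you are allowed to modify it (both are assumptions); with MSGLEVEL=I the log should note whether the index was used:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;options msglevel=i;

/* Assumption: ttt is in the WORK library and may be modified */
proc datasets library=work nolist;
    modify ttt;
    index create UPDATE_DATE;   /* simple index on the filter variable */
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Building the index is itself a pass over the data, and an index generally only helps when the WHERE compares UPDATE_DATE directly rather than through functions.&lt;/P&gt;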
&lt;P&gt;The use of INPUT(PUT()) might be expensive.&amp;nbsp; It looks like you are selecting all the September 2022 updates?&amp;nbsp; I would try:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;
proc sort data=ttt(Where=('01Sep2022'd&amp;lt;=UPDATE_DATE&amp;lt;='30Sep2022'd)) out=want;
by   UPDATE_DATE    FK_A       FK_B    REFERENCE_DATE  ;
Run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 31 Oct 2022 12:37:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reduce-run-time/m-p/841621#M332790</guid>
      <dc:creator>Quentin</dc:creator>
      <dc:date>2022-10-31T12:37:58Z</dc:date>
    </item>
    <item>
      <title>Re: Reduce run time</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reduce-run-time/m-p/841622#M332791</link>
      <description>&lt;P&gt;Way 1 should be the most efficient, only one pass of the data.&lt;/P&gt;
&lt;P&gt;Question: what are the constraints in your system?&lt;/P&gt;
&lt;P&gt;Number of cores, system memory?&lt;/P&gt;
&lt;P&gt;Your settings of MEMSIZE and SORTSIZE?&lt;/P&gt;
&lt;P&gt;It would help if you could send the log from a run with&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;options stimer msglevel=i;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Generally speaking, you want to avoid too many function calls in your where clauses. Maybe you can use BETWEEN... AND.. logic instead?&lt;/P&gt;
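&lt;P&gt;For example, a range test on the raw date, sketched here with the September 2022 bounds hard-coded as an assumption:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sort data=ttt(where=(UPDATE_DATE between '01SEP2022'd and '30SEP2022'd)) out=want;
    by UPDATE_DATE FK_A FK_B REFERENCE_DATE;
run;&lt;/CODE&gt;&lt;/PRE&gt;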
&lt;P&gt;What is the ratio of your subset? If less than 10% you could consider indexing on update_date.&lt;/P&gt;</description>
      <pubDate>Mon, 31 Oct 2022 12:37:27 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reduce-run-time/m-p/841622#M332791</guid>
      <dc:creator>LinusH</dc:creator>
      <dc:date>2022-10-31T12:37:27Z</dc:date>
    </item>
    <item>
      <title>Re: Reduce run time</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reduce-run-time/m-p/841623#M332792</link>
      <description>&lt;P&gt;Maybe Way1 is faster, but why don't you try it anyway?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Also...try&amp;nbsp;&lt;/P&gt;
&lt;P&gt;-Increase MEMSIZE and REALMEMSIZE allocations.&lt;/P&gt;
&lt;P&gt;-Use the TAGSORT option of PROC SORT.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;-Get better hardware.&lt;/STRONG&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 31 Oct 2022 12:40:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reduce-run-time/m-p/841623#M332792</guid>
      <dc:creator>japelin</dc:creator>
      <dc:date>2022-10-31T12:40:17Z</dc:date>
    </item>
    <item>
      <title>Re: Reduce run time</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reduce-run-time/m-p/841624#M332793</link>
      <description>&lt;P&gt;Way 1 requires one data set to be written, Way 2 requires 2 data sets to be written. I would guess that Way 1 wins in speed just for that reason.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Why the manipulation of dates with INPUT(PUT())? That's got to be slower than just doing tests on the variable UPDATE_DATE. Compute &amp;amp;start_date and &amp;amp;end_date from&amp;nbsp;&lt;FONT face="courier new,courier"&gt;%let YYMM=2209;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;%let start_date='01SEP2022'd;
%let end_date='30SEP2022'd;

proc sort data=ttt(Where=(&amp;amp;start_date&amp;lt;=update_date&amp;lt;=&amp;amp;end_date)) out=want;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
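&lt;P&gt;One way to derive the two macro variables from YYMM without a DATA step, as a sketch only (it assumes the YEARCUTOFF option maps the two-digit year 22 to 2022):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;%let YYMM=2209;

/* First day of the month, read with the YYMMN4. informat */
%let start_date=%sysfunc(inputn(&amp;amp;YYMM,yymmn4.));
/* Last day of the same month */
%let end_date=%sysfunc(intnx(month,&amp;amp;start_date,0,e));

/* Check the values in the log before using them in the WHERE */
%put start=%sysfunc(putn(&amp;amp;start_date,date9.)) end=%sysfunc(putn(&amp;amp;end_date,date9.));&lt;/CODE&gt;&lt;/PRE&gt;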
&lt;P&gt;Don't do this part:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;UPDATE_DATE  ne '16MAY2022'd&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;unless &amp;amp;YYMM = 2205&lt;/P&gt;</description>
      <pubDate>Mon, 31 Oct 2022 12:45:57 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reduce-run-time/m-p/841624#M332793</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2022-10-31T12:45:57Z</dc:date>
    </item>
    <item>
      <title>Re: Reduce run time</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reduce-run-time/m-p/841625#M332794</link>
      <description>&lt;P&gt;What about variables? Reducing the number of variables through KEEP= or DROP= will reduce the runtime, so see if you need all the variables from the input dataset in your result.&lt;/P&gt;</description>
      <pubDate>Mon, 31 Oct 2022 12:45:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reduce-run-time/m-p/841625#M332794</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2022-10-31T12:45:12Z</dc:date>
    </item>
    <item>
      <title>Re: Reduce run time</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reduce-run-time/m-p/841629#M332796</link>
      <description>&lt;P&gt;Your solution is perfect.&lt;/P&gt;
&lt;P&gt;It still takes a few minutes to run, but before it ran for a very long time.&lt;/P&gt;</description>
      <pubDate>Mon, 31 Oct 2022 13:04:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reduce-run-time/m-p/841629#M332796</guid>
      <dc:creator>Ronein</dc:creator>
      <dc:date>2022-10-31T13:04:12Z</dc:date>
    </item>
    <item>
      <title>Re: Reduce run time</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reduce-run-time/m-p/841631#M332798</link>
      <description>1                                                          The SAS System                             14:41 Monday, October 31, 2022&lt;BR /&gt;&lt;BR /&gt;1          ;*';*";*/;quit;run;&lt;BR /&gt;2          OPTIONS PAGENO=MIN;&lt;BR /&gt;3          %LET _CLIENTTASKLABEL='Program';&lt;BR /&gt;4          %LET _CLIENTPROCESSFLOWNAME='Process Flow';&lt;BR /&gt;5          %LET _CLIENTPROJECTPATH='';&lt;BR /&gt;6          %LET _CLIENTPROJECTPATHHOST='';&lt;BR /&gt;7          %LET _CLIENTPROJECTNAME='';&lt;BR /&gt;8          %LET _SASPROGRAMFILE='';&lt;BR /&gt;9          %LET _SASPROGRAMFILEHOST='';&lt;BR /&gt;10         &lt;BR /&gt;11         ODS _ALL_ CLOSE;&lt;BR /&gt;12         OPTIONS DEV=PNG;&lt;BR /&gt;13         GOPTIONS XPIXELS=0 YPIXELS=0;&lt;BR /&gt;14         FILENAME EGSR TEMP;&lt;BR /&gt;15         ODS tagsets.sasreport13(ID=EGSR) FILE=EGSR&lt;BR /&gt;16             STYLE=HTMLBlue&lt;BR /&gt;17             STYLESHEET=(URL="file:///C:/Program%20Files/SASHome/SASEnterpriseGuide/7.1/Styles/HTMLBlue.css")&lt;BR /&gt;18             NOGTITLE&lt;BR /&gt;19             NOGFOOTNOTE&lt;BR /&gt;20             GPATH=&amp;amp;sasworklocation&lt;BR /&gt;21             ENCODING=UTF8&lt;BR /&gt;22             options(rolap="on")&lt;BR /&gt;23         ;&lt;BR /&gt;NOTE: Writing TAGSETS.SASREPORT13(EGSR) Body file: EGSR&lt;BR /&gt;24         &lt;BR /&gt;25         GOPTIONS ACCESSIBLE;&lt;BR /&gt;26         options stimer msglevel=i;&lt;BR /&gt;27         &lt;BR /&gt;28         GOPTIONS NOACCESSIBLE;&lt;BR /&gt;29         %LET _CLIENTTASKLABEL=;&lt;BR /&gt;30         %LET _CLIENTPROCESSFLOWNAME=;&lt;BR /&gt;31         %LET _CLIENTPROJECTPATH=;&lt;BR /&gt;32         %LET _CLIENTPROJECTPATHHOST=;&lt;BR /&gt;33         %LET _CLIENTPROJECTNAME=;&lt;BR /&gt;34         %LET _SASPROGRAMFILE=;&lt;BR /&gt;35         %LET _SASPROGRAMFILEHOST=;&lt;BR /&gt;36         &lt;BR /&gt;37         ;*';*";*/;quit;run;&lt;BR /&gt;38         ODS _ALL_ CLOSE;&lt;BR /&gt;39        
 &lt;BR /&gt;40         &lt;BR /&gt;41         QUIT; RUN;&lt;BR /&gt;42         &lt;BR /&gt;</description>
      <pubDate>Mon, 31 Oct 2022 13:05:14 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reduce-run-time/m-p/841631#M332798</guid>
      <dc:creator>Ronein</dc:creator>
      <dc:date>2022-10-31T13:05:14Z</dc:date>
    </item>
    <item>
      <title>Re: Reduce run time</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reduce-run-time/m-p/841634#M332800</link>
      <description>Sure but in this case the main problem was not due to number of variables....</description>
      <pubDate>Mon, 31 Oct 2022 13:12:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reduce-run-time/m-p/841634#M332800</guid>
      <dc:creator>Ronein</dc:creator>
      <dc:date>2022-10-31T13:12:38Z</dc:date>
    </item>
    <item>
      <title>Re: Reduce run time</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reduce-run-time/m-p/841635#M332801</link>
      <description>&lt;P&gt;In September there were observations with different update_date values, and I should ignore the observations from 16MAY2022.&lt;/P&gt;
&lt;P&gt;In other months there is no problem, and I just need to select all rows for the YYMM that the user supplies.&lt;/P&gt;</description>
      <pubDate>Mon, 31 Oct 2022 13:14:42 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reduce-run-time/m-p/841635#M332801</guid>
      <dc:creator>Ronein</dc:creator>
      <dc:date>2022-10-31T13:14:42Z</dc:date>
    </item>
    <item>
      <title>Re: Reduce run time</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reduce-run-time/m-p/841649#M332805</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/159549"&gt;@Ronein&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;In September there were observations of&amp;nbsp; different update_date and I should ignore observations from 16MAY2022.&lt;/P&gt;
&lt;P&gt;In other months there is no problem and just need to select all rows with YYMM&amp;nbsp; as user put&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Thanks for the explanation. You can still create macro code that only does the test for 16MAY2022 for the month where it is a problem and not do the test for months when it is not a problem. This should help the speed issue. Perhaps you have done that already in your real code.&lt;/P&gt;</description>
      <pubDate>Mon, 31 Oct 2022 14:01:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reduce-run-time/m-p/841649#M332805</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2022-10-31T14:01:17Z</dc:date>
    </item>
    <item>
      <title>Re: Reduce run time</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reduce-run-time/m-p/841651#M332807</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/159549"&gt;@Ronein&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;In September there were observations of&amp;nbsp; different update_date and I should ignore observations from 16MAY2022.&lt;/P&gt;
&lt;P&gt;In other months there is no problem and just need to select all rows with YYMM&amp;nbsp; as user put&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;I think you mean "In &lt;STRONG&gt;May&amp;nbsp;&lt;/STRONG&gt;there were observations of&amp;nbsp; different update_date and I should ignore observations from 16MAY2022" ?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 31 Oct 2022 14:10:49 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reduce-run-time/m-p/841651#M332807</guid>
      <dc:creator>Quentin</dc:creator>
      <dc:date>2022-10-31T14:10:49Z</dc:date>
    </item>
    <item>
      <title>Re: Reduce run time</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reduce-run-time/m-p/841652#M332808</link>
      <description>Could you please show the code?</description>
      <pubDate>Mon, 31 Oct 2022 14:12:07 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reduce-run-time/m-p/841652#M332808</guid>
      <dc:creator>Ronein</dc:creator>
      <dc:date>2022-10-31T14:12:07Z</dc:date>
    </item>
    <item>
      <title>Re: Reduce run time</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reduce-run-time/m-p/841655#M332810</link>
      <description>&lt;P&gt;Something like this:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;%let YYMM=2209;
/* Derive the first and last day of the requested month as SAS date values */
data _null_;
    call symputx('start_date',input("&amp;amp;yymm",yymmn4.));
    call symputx('end_date',intnx('month',input("&amp;amp;yymm",yymmn4.),0,'e'));
run;

/* Apply the 16MAY2022 exclusion only for the month where it is needed */
%if &amp;amp;yymm=2209 %then %do;
proc sort data=ttt(Where=(UPDATE_DATE ne '16MAY2022'd AND &amp;amp;start_date&amp;lt;=update_date&amp;lt;=&amp;amp;end_date)) out=want;
%end;
%else %do;
proc sort data=ttt(Where=(&amp;amp;start_date&amp;lt;=update_date&amp;lt;=&amp;amp;end_date)) out=want;
%end;
by UPDATE_DATE FK_A FK_B REFERENCE_DATE;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;or even simpler&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sort data=ttt(Where=(%if &amp;amp;yymm=2209 %then %do; UPDATE_DATE ne '16MAY2022'd AND %end; 
    &amp;amp;start_date&amp;lt;=update_date&amp;lt;=&amp;amp;end_date)) out=want;
by UPDATE_DATE FK_A FK_B REFERENCE_DATE;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Mon, 31 Oct 2022 14:33:27 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reduce-run-time/m-p/841655#M332810</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2022-10-31T14:33:27Z</dc:date>
    </item>
  </channel>
</rss>

