<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: how to run faster if it reads several billions rows in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/how-to-run-faster-if-it-reads-several-billions-rows/m-p/622411#M183084</link>
    <description>&lt;P&gt;Hi&amp;nbsp;&lt;SPAN class="login-bold"&gt;&lt;A id="link_8" class="lia-link-navigation lia-page-link lia-user-name-link" href="https://communities.sas.com/t5/user/viewprofilepage/user-id/11562" target="_self"&gt;KurtBremser&lt;/A&gt;,&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="login-bold"&gt;Thanks a lot for your quick reply. But I have almost 35 variables in&amp;nbsp;Acquisition_Score_Distinct&amp;nbsp; table, so you are saying I have to write all the variable names individually!!! The might be a tedious task, but if you are recommending then will do the same.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="login-bold"&gt;Also, just one quick ask. Apart from the above task I am hindering into another situation as well with same tables as above, how to rectify the same can you please guide. The below is the sample with sashelp.class table.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data test;
set sashelp.class (obs=10);
run;
data test1;
set sashelp.class(obs=15);
run;

proc ds2;
 thread thread_name1 / overwrite = yes;
 method run();
 set {Select distinct t1.name,t2.sex,t2.age
from test t1 left
join test1 t2 on 
strip(upcase(t1.name)) =* strip(upcase(t2.name))};
 end;
 endthread;

 data test3 / overwrite = yes;
 dcl thread thread_name1 t;
 method run();
 set from t threads = 4;
 end;
 enddata;
 run;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&lt;SPAN class="login-bold"&gt;Log:&lt;/SPAN&gt;&lt;/P&gt;
&lt;PRE&gt;23         GOPTIONS ACCESSIBLE;
24         proc ds2;
25          thread thread_name1 / overwrite = yes;
26          method run();
27          set {Select distinct t1.name,t2.sex,t2.age
28         from test t1 left
29         join test1 t2 on
30         strip(upcase(t1.name)) =* strip(upcase(t2.name))};
31          end;
32          endthread;
33         
34          data test3 / overwrite = yes;
35          dcl thread thread_name1 t;
36          method run();
37          set from t threads = 4;
38          end;
39          enddata;
40          run;
ERROR: Compilation error.
ERROR: Operator does not exist:  NVARCHAR =* NVARCHAR
WARNING: No operator matches the given name and argument type(s). You might need to add explicit typecasts.
ERROR: Line 27: Unable to prepare SELECT statement for table {Select distinct t1.name,t2.sex,t2.age  from test t1 left  
       join test1 t2 on  strip(upcase(t1.name)) =* strip(upcase(t2.name))} (rc=0x80fff802U).
NOTE: PROC DS2 has set option NOEXEC and will continue to prepare statements.
41         quit;

NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE DS2 used (Total process time):
      real time           1.07 seconds
      cpu time            0.35 seconds&lt;/PRE&gt;
&lt;P&gt;&lt;SPAN class="login-bold"&gt;The same above code is working in the PROC SQL completely fine, but in the DS2 Fedsql, it's asking to cast. I tried to cast with setnvarchar method but still I am missing something I think.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="login-bold"&gt;The =* is giving the maximum catch, I tried with&amp;nbsp; =,contains, find(t2.name,t1.name) those are trying to match exactly, so t2 table column values are appearing as missing. My concern is how we can use the =* in proc ds2.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="login-bold"&gt;Please guide.&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Wed, 05 Feb 2020 11:26:36 GMT</pubDate>
    <dc:creator>rajdeep</dc:creator>
    <dc:date>2020-02-05T11:26:36Z</dc:date>
    <item>
      <title>how to run faster if it reads several billions rows</title>
      <link>https://communities.sas.com/t5/SAS-Programming/how-to-run-faster-if-it-reads-several-billions-rows/m-p/358997#M84406</link>
      <description>&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
create table  lib1.want as
         select *
            from have
			where va1=2
           ;
quit;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;will use datastep faster than proc sql?&lt;/P&gt;
&lt;P&gt;It is estimated that it will run two hours to get the new dataset I want based on the code above. So it is hard to try different code. I would like to get advice from you on how to run it efficiently and faster. Thank you&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 16 May 2017 13:03:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/how-to-run-faster-if-it-reads-several-billions-rows/m-p/358997#M84406</guid>
      <dc:creator>Bal23</dc:creator>
      <dc:date>2017-05-16T13:03:12Z</dc:date>
    </item>
    <item>
      <title>Re: how to run faster if it reads several billions rows</title>
      <link>https://communities.sas.com/t5/SAS-Programming/how-to-run-faster-if-it-reads-several-billions-rows/m-p/359014#M84416</link>
      <description>&lt;P&gt;Since that is a sequential read through the dataset without much processing, you won't gain anything by using a data step. You're simply bound by the sheer amount of data that has to be read.&lt;/P&gt;
&lt;P&gt;Depending on the structure of have, you might gain something (or even a lot) if you use the compress=yes option when you store it, and do similar for dataset want.&lt;/P&gt;</description>
      <pubDate>Tue, 16 May 2017 13:23:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/how-to-run-faster-if-it-reads-several-billions-rows/m-p/359014#M84416</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2017-05-16T13:23:46Z</dc:date>
    </item>
    <item>
      <title>Re: how to run faster if it reads several billions rows</title>
      <link>https://communities.sas.com/t5/SAS-Programming/how-to-run-faster-if-it-reads-several-billions-rows/m-p/359016#M84418</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you the following can be applied to you, then your query should run faster:&lt;/P&gt;
&lt;P&gt;- Your under lying storage system, allows for 50-75 MB/sec sustained I/O throughput&lt;/P&gt;
&lt;P&gt;- Use SPDE/SPDS for your Output Library (lib1) to speed the writing of your output table&lt;/P&gt;
&lt;P&gt;&lt;A href="http://support.sas.com/rnd/scalability/spde/index.html" target="_self"&gt;http://support.sas.com/rnd/scalability/spde/index.html&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;- Use Index for your Va1 variable to speed the reading of your subset/filtered data&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Hope this helps,&lt;/P&gt;
&lt;P&gt;Ahmed&lt;/P&gt;</description>
      <pubDate>Tue, 16 May 2017 13:23:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/how-to-run-faster-if-it-reads-several-billions-rows/m-p/359016#M84418</guid>
      <dc:creator>AhmedAl_Attar</dc:creator>
      <dc:date>2017-05-16T13:23:32Z</dc:date>
    </item>
    <item>
      <title>Re: how to run faster if it reads several billions rows</title>
      <link>https://communities.sas.com/t5/SAS-Programming/how-to-run-faster-if-it-reads-several-billions-rows/m-p/359017#M84419</link>
      <description>&lt;P&gt;As an addendum to what &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13868"&gt;@AhmedAl_Attar&lt;/a&gt; said, if you can throw hardware at the problem, consider to set up an array of disks and span a SPDE library over those. For SPDE to really improve your throughput, you need to remove the bottleneck caused by a single disk having to carry all your storage traffic.&lt;/P&gt;</description>
      <pubDate>Tue, 16 May 2017 13:27:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/how-to-run-faster-if-it-reads-several-billions-rows/m-p/359017#M84419</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2017-05-16T13:27:08Z</dc:date>
    </item>
    <item>
      <title>Re: how to run faster if it reads several billions rows</title>
      <link>https://communities.sas.com/t5/SAS-Programming/how-to-run-faster-if-it-reads-several-billions-rows/m-p/359049#M84432</link>
      <description>&lt;P&gt;Assuming you are going to read the file several times, each time with different VA1 value,&lt;/P&gt;
&lt;P&gt;and the file is mostly static,&amp;nbsp;then do the next preparing steps &lt;U&gt;once&lt;/U&gt;:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;- use option &lt;STRONG&gt;compress=yes&amp;nbsp;&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;- &lt;STRONG&gt;sort&lt;/STRONG&gt; the file by VA1 or by variables used for WHERE selection and&lt;/P&gt;
&lt;P&gt;&amp;nbsp; &amp;nbsp;create the output with &lt;STRONG&gt;indexes&lt;/STRONG&gt; of those variables.&lt;/P&gt;
&lt;P&gt;&amp;nbsp; &amp;nbsp;Pay attention - indexes needs storage&lt;/P&gt;
&lt;P&gt;- If applicable use SPDE/SPDS for the sorted output&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thus, using advices of&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/11562"&gt;@Kurt_Bremser&lt;/a&gt;&amp;nbsp;and of&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13868"&gt;@AhmedAl_Attar&lt;/a&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The code should look like:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;libname input '.....';
libname output SPDE '.......';

proc sort data=input.have
                 out=output.want (index=VA1 ....);  /* add expected variables used with WHERE */
         by  VA1 ..... ;  /* same variables defined for INDEX */
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 16 May 2017 14:30:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/how-to-run-faster-if-it-reads-several-billions-rows/m-p/359049#M84432</guid>
      <dc:creator>Shmuel</dc:creator>
      <dc:date>2017-05-16T14:30:32Z</dc:date>
    </item>
    <item>
      <title>Re: how to run faster if it reads several billions rows</title>
      <link>https://communities.sas.com/t5/SAS-Programming/how-to-run-faster-if-it-reads-several-billions-rows/m-p/359136#M84458</link>
      <description>&lt;P&gt;One more potential way to speed your query, would be&lt;/P&gt;
&lt;P&gt;- if you have SAS 9.4, you may want to try the Threaded Read feature of Proc DS2, that's available in BASE SAS.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Here is an implementation example&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;options threads; /* Tells SAS to use threaded processing */
%let have_ds = have;
%let filter= va1=2;

proc ds2;
	thread newton/overwrite=yes;
		/* Some debugging variables */
		dcl double count thisThread;
		drop count thisThread;

		method run();
			SET {select * from &amp;amp;have_ds where &amp;amp;filter}; /* Ability to use FedSql query within DS2 */

			/* Assign/Initialize debugging vars */
			thisThread=_threadid_;
			count+1;
		end;

		method term();
			put '**Thread' _threadid_ 'processed'  count 'rows:';
		end;
	endthread;

	data lib1.want/overwrite=yes;
		dcl thread newton frac; /* Declare an Instance of the newton thread */

		method run();
			set from frac threads=4; /* &amp;lt;--- You Change the number of threads to control the threaded reads */
		end;
	enddata;
run;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Please note, according to the &lt;A href="http://support.sas.com/documentation/cdl/en/ds2ref/69739/HTML/default/viewer.htm#p0qykqw1fdra8dn1449vxg9ydfkk.htm" target="_self"&gt;Proc DS2 documentation&lt;/A&gt;, "If one computation thread can keep up with the I/O thread, then that single thread is used for all computation."&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Ahmed&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 16 May 2017 18:03:34 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/how-to-run-faster-if-it-reads-several-billions-rows/m-p/359136#M84458</guid>
      <dc:creator>AhmedAl_Attar</dc:creator>
      <dc:date>2017-05-16T18:03:34Z</dc:date>
    </item>
    <item>
      <title>Re: how to run faster if it reads several billions rows</title>
      <link>https://communities.sas.com/t5/SAS-Programming/how-to-run-faster-if-it-reads-several-billions-rows/m-p/622026#M182931</link>
      <description>&lt;P&gt;hi&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Can you please suggest how to run the below code in PROC DS2, because it's throwing error. In the other hand it's running successfully in generic PROC SQL.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
SELECT DISTINCT t1.*,t2.DATE_MN,
          /* Calculation */
            (SUM(DISTINCT(t2.CURRENCY_AMOUNT))) FORMAT=15.2 AS ALL_TRANSACTION
      FROM  WORK.Acquisition_Score_Distinct t1
           LEFT JOIN WORK.ALL_TRAN_TYPE_KEY_FILTER t2 ON (t1.ACCOUNT_NUMBER = t2.ACCOUNT_NUMBER)
     GROUP BY t1.ACCOUNT_NUMBER,t2.DATE_MN
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The above left and right tables are having 10 cores data, so want to use PROC DS2 for the same. Please suggest.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks in advance.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 03 Feb 2020 19:48:24 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/how-to-run-faster-if-it-reads-several-billions-rows/m-p/622026#M182931</guid>
      <dc:creator>rajdeep</dc:creator>
      <dc:date>2020-02-03T19:48:24Z</dc:date>
    </item>
    <item>
      <title>Re: how to run faster if it reads several billions rows</title>
      <link>https://communities.sas.com/t5/SAS-Programming/how-to-run-faster-if-it-reads-several-billions-rows/m-p/622030#M182933</link>
      <description>&lt;P&gt;Whenever code throws an error that you can't make sense of,&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;&lt;FONT size="4"&gt;POST THE LOG.&lt;/FONT&gt;&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT size="2"&gt;Use the {i} button to post the log. &lt;STRONG&gt;Do not skip this step.&lt;/STRONG&gt;&lt;/FONT&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 03 Feb 2020 20:05:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/how-to-run-faster-if-it-reads-several-billions-rows/m-p/622030#M182933</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2020-02-03T20:05:17Z</dc:date>
    </item>
    <item>
      <title>Re: how to run faster if it reads several billions rows</title>
      <link>https://communities.sas.com/t5/SAS-Programming/how-to-run-faster-if-it-reads-several-billions-rows/m-p/622116#M182979</link>
      <description>&lt;P&gt;And what is a "10 cores data"?&lt;/P&gt;
&lt;P&gt;If you want to give us information about the size of your data, use orders of magnitude that are easily understood everywhere - thousand, million, billion - or, even better, designations like K (kilo), M (mega), G (giga), T (tera), and so on.&lt;/P&gt;</description>
      <pubDate>Tue, 04 Feb 2020 06:20:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/how-to-run-faster-if-it-reads-several-billions-rows/m-p/622116#M182979</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2020-02-04T06:20:38Z</dc:date>
    </item>
    <item>
      <title>Re: how to run faster if it reads several billions rows</title>
      <link>https://communities.sas.com/t5/SAS-Programming/how-to-run-faster-if-it-reads-several-billions-rows/m-p/622159#M183005</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;SPAN class="login-bold"&gt;&lt;A id="link_8" class="lia-link-navigation lia-page-link lia-user-name-link" href="https://communities.sas.com/t5/user/viewprofilepage/user-id/11562" target="_self"&gt;KurtBremser&lt;/A&gt;,&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="login-bold"&gt;I am extremely sorry for the data magnitude, it's 100 million I was trying to convey. Also, I want to highlight that the above query is running fine in generic PROC SQL, but failing in the PROC DS2 Multi-threading style in EG 7.1 windows Environment.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="login-bold"&gt;The below is the code and log:&lt;BR /&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc ds2; 
	thread newton/overwrite=yes;
			method run();
			SET { SELECT DISTINCT t1.*,t2.DATE_MN,
          /* Calculation */
            (SUM(DISTINCT(t2.CURRENCY_AMOUNT))) AS ALL_TRANSACTION
      FROM  WORK.Acquisition_Score_Distinct t1
           LEFT JOIN WORK.ALL_TRAN_TYPE_KEY_FILTER t2 ON (t1.ACCOUNT_NUMBER = t2.ACCOUNT_NUMBER)
     GROUP BY t1.ACCOUNT_NUMBER,t2.DATE_MN}; /* Ability to use FedSql query within DS2 */

				end;

	endthread;
data All_Month_Transactions/overwrite=yes;
		dcl thread newton frac1; 

		method run();
			set from frac1 threads=4; 
		end;
	enddata;
run;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&lt;SPAN class="login-bold"&gt;Log:&lt;/SPAN&gt;&lt;/P&gt;
&lt;PRE&gt;NOTE: Writing TAGSETS.SASREPORT13(EGSR) Body file: EGSR
22         
23         GOPTIONS ACCESSIBLE;
24         proc ds2; /*Score column mapping based on Postal code and FSA*/
25         	thread newton/overwrite=yes;
26         			method run();
27         			SET { SELECT DISTINCT t1.*,t2.DATE_MN,
28                   /* Calculation */
29                     (SUM(DISTINCT(t2.CURRENCY_AMOUNT))) AS ALL_TRANSACTION
30               FROM  WORK.Acquisition_Score_Distinct t1
31                    LEFT JOIN WORK.ALL_TRAN_TYPE_KEY_FILTER t2 ON (t1.ACCOUNT_NUMBER = t2.ACCOUNT_NUMBER)
32              GROUP BY t1.ACCOUNT_NUMBER,t2.DATE_MN}; /* Ability to use FedSql query within DS2 */
33         
34         				end;
35         
36         	endthread;
37         data All_Month_Transactions/overwrite=yes;
38         		dcl thread newton frac1; /* Declare an Instance of the newton thread */
39         
40         		method run();
41         			set from frac1 threads=4; /* &amp;lt;--- You Change the number of threads to control the threaded reads */
42         		end;
43         	enddata;
44         run;
ERROR: Compilation error.
ERROR: Column "T1.PCF_CUST_ID" must be GROUPed or used in an aggregate function
ERROR: Line 27: Unable to prepare SELECT statement for table { SELECT DISTINCT t1.*,t2.DATE_MN,   
       (SUM(DISTINCT(t2.CURRENCY_AMOUNT))) AS ALL_TRANSACTION  FROM WORK.Acquisition_Score_Distinct t1  LEFT JOIN 
       WORK.ALL_TRAN_TYPE_KEY_FILTER t2 ON (t1.ACCOUNT_NUMBER = t2.ACCOUNT_NUMBER)  GROUP BY t1.ACCOUNT_NUMBER,t2.DATE_MN} 
       (rc=0x80fff802U).
NOTE: PROC DS2 has set option NOEXEC and will continue to prepare statements.
45         quit;

NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE DS2 used (Total process time)&lt;/PRE&gt;
&lt;P&gt;The above log is asking to include "T1.PCF_CUST_ID" in the grouping, but I am not using the same column at all in my code. Neither in the group by nor with any aggregate function.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Please help if I understood something wrong or if there is any syntax mistake with PROC DS2. Please let me know if any other details are required.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks in Advance.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 04 Feb 2020 10:03:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/how-to-run-faster-if-it-reads-several-billions-rows/m-p/622159#M183005</guid>
      <dc:creator>rajdeep</dc:creator>
      <dc:date>2020-02-04T10:03:45Z</dc:date>
    </item>
    <item>
      <title>Re: how to run faster if it reads several billions rows</title>
      <link>https://communities.sas.com/t5/SAS-Programming/how-to-run-faster-if-it-reads-several-billions-rows/m-p/622165#M183007</link>
      <description>&lt;P&gt;You use the asterisk for all variables of t1, so they must be accounted for. Use a discrete list of variables instead, so you don't get&amp;nbsp;PCF_CUST_ID included implicitly (and you don't need DISTINCT IMO):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;			SET { SELECT t1.ACCOUNT_NUMBER,t2.DATE_MN,
          /* Calculation */
            (SUM(DISTINCT(t2.CURRENCY_AMOUNT))) AS ALL_TRANSACTION
      FROM  WORK.Acquisition_Score_Distinct t1
           LEFT JOIN WORK.ALL_TRAN_TYPE_KEY_FILTER t2 ON (t1.ACCOUNT_NUMBER = t2.ACCOUNT_NUMBER)
     GROUP BY t1.ACCOUNT_NUMBER,t2.DATE_MN}; /* Ability to use FedSql query within DS2 */
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 04 Feb 2020 12:04:15 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/how-to-run-faster-if-it-reads-several-billions-rows/m-p/622165#M183007</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2020-02-04T12:04:15Z</dc:date>
    </item>
    <item>
      <title>Re: how to run faster if it reads several billions rows</title>
      <link>https://communities.sas.com/t5/SAS-Programming/how-to-run-faster-if-it-reads-several-billions-rows/m-p/622411#M183084</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;SPAN class="login-bold"&gt;&lt;A id="link_8" class="lia-link-navigation lia-page-link lia-user-name-link" href="https://communities.sas.com/t5/user/viewprofilepage/user-id/11562" target="_self"&gt;KurtBremser&lt;/A&gt;,&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="login-bold"&gt;Thanks a lot for your quick reply. But I have almost 35 variables in&amp;nbsp;Acquisition_Score_Distinct&amp;nbsp; table, so you are saying I have to write all the variable names individually!!! The might be a tedious task, but if you are recommending then will do the same.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="login-bold"&gt;Also, just one quick ask. Apart from the above task I am hindering into another situation as well with same tables as above, how to rectify the same can you please guide. The below is the sample with sashelp.class table.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data test;
set sashelp.class (obs=10);
run;
data test1;
set sashelp.class(obs=15);
run;

proc ds2;
 thread thread_name1 / overwrite = yes;
 method run();
 set {Select distinct t1.name,t2.sex,t2.age
from test t1 left
join test1 t2 on 
strip(upcase(t1.name)) =* strip(upcase(t2.name))};
 end;
 endthread;

 data test3 / overwrite = yes;
 dcl thread thread_name1 t;
 method run();
 set from t threads = 4;
 end;
 enddata;
 run;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&lt;SPAN class="login-bold"&gt;Log:&lt;/SPAN&gt;&lt;/P&gt;
&lt;PRE&gt;23         GOPTIONS ACCESSIBLE;
24         proc ds2;
25          thread thread_name1 / overwrite = yes;
26          method run();
27          set {Select distinct t1.name,t2.sex,t2.age
28         from test t1 left
29         join test1 t2 on
30         strip(upcase(t1.name)) =* strip(upcase(t2.name))};
31          end;
32          endthread;
33         
34          data test3 / overwrite = yes;
35          dcl thread thread_name1 t;
36          method run();
37          set from t threads = 4;
38          end;
39          enddata;
40          run;
ERROR: Compilation error.
ERROR: Operator does not exist:  NVARCHAR =* NVARCHAR
WARNING: No operator matches the given name and argument type(s). You might need to add explicit typecasts.
ERROR: Line 27: Unable to prepare SELECT statement for table {Select distinct t1.name,t2.sex,t2.age  from test t1 left  
       join test1 t2 on  strip(upcase(t1.name)) =* strip(upcase(t2.name))} (rc=0x80fff802U).
NOTE: PROC DS2 has set option NOEXEC and will continue to prepare statements.
41         quit;

NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE DS2 used (Total process time):
      real time           1.07 seconds
      cpu time            0.35 seconds&lt;/PRE&gt;
&lt;P&gt;&lt;SPAN class="login-bold"&gt;The same above code is working in the PROC SQL completely fine, but in the DS2 Fedsql, it's asking to cast. I tried to cast with setnvarchar method but still I am missing something I think.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="login-bold"&gt;The =* is giving the maximum catch, I tried with&amp;nbsp; =,contains, find(t2.name,t1.name) those are trying to match exactly, so t2 table column values are appearing as missing. My concern is how we can use the =* in proc ds2.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="login-bold"&gt;Please guide.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 05 Feb 2020 11:26:36 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/how-to-run-faster-if-it-reads-several-billions-rows/m-p/622411#M183084</guid>
      <dc:creator>rajdeep</dc:creator>
      <dc:date>2020-02-05T11:26:36Z</dc:date>
    </item>
    <item>
      <title>Re: how to run faster if it reads several billions rows</title>
      <link>https://communities.sas.com/t5/SAS-Programming/how-to-run-faster-if-it-reads-several-billions-rows/m-p/622421#M183086</link>
      <description>&lt;P&gt;When you use 35 variables in a SQL select that also has a group by, every one of those 35 variables must either be one of the group variables, or subject to a summary function.&lt;/P&gt;
&lt;P&gt;It is only in PROC SQL that SAS allows you to have additional variables, and then it does the automatic remerge, something that is not done in ANSI SQL or FedSQL.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;And =* is not a valid operator. From where did you get that? If you want a "not equal", use the mnemonic ne or ^=.&lt;/P&gt;</description>
      <pubDate>Wed, 05 Feb 2020 12:43:07 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/how-to-run-faster-if-it-reads-several-billions-rows/m-p/622421#M183086</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2020-02-05T12:43:07Z</dc:date>
    </item>
    <item>
      <title>Re: how to run faster if it reads several billions rows</title>
      <link>https://communities.sas.com/t5/SAS-Programming/how-to-run-faster-if-it-reads-several-billions-rows/m-p/622449#M183100</link>
      <description>&lt;P&gt;If you want to remerge the aggregate function results then you will have to add the code to do it yourself.&lt;/P&gt;
&lt;P&gt;So instead of doing code like this that works in PROC SQL:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;select *,mean(age) as gender_mean_age
from sashelp.class
group by sex
;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;You will need to do something like&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;select a.*,b.gender_mean_age
from sashelp.class a
inner join 
 (select age,mean(age) as gender_mean_age
  from sashelp.class group by sex
 ) b
on a.sex=b.sex
;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 05 Feb 2020 14:06:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/how-to-run-faster-if-it-reads-several-billions-rows/m-p/622449#M183100</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2020-02-05T14:06:59Z</dc:date>
    </item>
    <item>
      <title>Re: how to run faster if it reads several billions rows</title>
      <link>https://communities.sas.com/t5/SAS-Programming/how-to-run-faster-if-it-reads-several-billions-rows/m-p/623813#M183713</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN class="login-bold"&gt;&lt;A id="link_8" class="lia-link-navigation lia-page-link lia-user-name-link" href="https://communities.sas.com/t5/user/viewprofilepage/user-id/11562" target="_self"&gt;KurtBremser&lt;/A&gt;,&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="login-bold"&gt;Sorry for the delayed response.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="login-bold"&gt;for the =* you can find the reference in the below link.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="login-bold"&gt;&lt;A href="https://documentation.sas.com/?docsetId=lestmtsref&amp;amp;docsetTarget=n1xbr9r0s9veq0n137iftzxq4g7e.htm&amp;amp;docsetVersion=9.4&amp;amp;locale=en"&gt;https://documentation.sas.com/?docsetId=lestmtsref&amp;amp;docsetTarget=n1xbr9r0s9veq0n137iftzxq4g7e.htm&amp;amp;docsetVersion=9.4&amp;amp;locale=en&lt;/A&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="login-bold"&gt;Basically, it's something sounds like kind of concept.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="login-bold"&gt;But anyways, thanks a lot for your quick action and response. Just wanted to know like can you please explain how we can use INPUTN function inside FEDSQL query?&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;
proc ds2; /*Score column mapping based on Postal code and FSA*/
	thread newton/overwrite=yes;
dcl double Level;
	  method convert(double level) returns double;
 set LIB_TS23.CIF_ACCOUNT_CURR;
 Level=inputn(CIFP_CUSTOM_DATA_81,8.);
 return Level;
 end;
		method run();
			SET {select distinct
      TRIM(LEFT(PUT(CIF_ACCOUNT_CURR.CIFP_ACCOUNT_ID5,BEST32.))) as MAST_ACCOUNT_ID
   from LIB_TS23.CIF_ACCOUNT_CURR as CIF_ACCOUNT_CURR 
inner join
      LIB_ADM.ADM_STD_DISP_USER_V as ADM_STD_DISP_USER_V
   on  ADM_STD_DISP_USER_V.APA_APP_NUM = INPUTN(CIF_ACCOUNT_CURR.CIFP_CUSTOM_DATA_81,8.)
      and CIF_ACCOUNT_CURR.CIFP_OFFICER_NBR = 'ADMACE'}; /* Ability to use FedSql query within DS2 */

				end;
	endthread;
data test/overwrite=yes;
		dcl thread newton frac1; /* Declare an Instance of the newton thread */
		method run();
			set from frac1 threads=4; /* &amp;lt;--- You Change the number of threads to control the threaded reads */
		end;
	enddata;
run;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&lt;SPAN class="login-bold"&gt;Log:&lt;/SPAN&gt;&lt;/P&gt;
&lt;PRE&gt;ERROR: Compilation error.
ERROR: Function INPUT(VARCHAR, DOUBLE) does not exist
WARNING: No function matches the given name and argument types.
ERROR: Line 34: Unable to prepare SELECT statement for table {select distinct  
       TRIM(LEFT(PUT(CIF_ACCOUNT_CURR.CIFP_ACCOUNT_ID5,BEST32.))) as MAST_ACCOUNT_ID  from   LIB_TS23.CIF_ACCOUNT_CURR as 
       CIF_ACCOUNT_CURR  inner join  LIB_ADM.ADM_STD_DISP_USER_V as ADM_STD_DISP_USER_V  on   ADM_STD_DISP_USER_V.APA_APP_NUM 
       = INPUT(CIF_ACCOUNT_CURR.CIFP_CUSTOM_DATA_81,8.)  and CIF_ACCOUNT_CURR.CIFP_OFFICER_NBR = 'ADMACE'} (rc=0x80fff802U).
NOTE: PROC DS2 has set option NOEXEC and will continue to prepare statements.&lt;/PRE&gt;
&lt;P&gt;&lt;SPAN class="login-bold"&gt;Like I tried to convert the same in method Convert() it's happening successfully, but when I tried to do the same in Fedsql, it's showing above error. Please guide how we can use INPUTN function in Fedsql or how to handle the above situation alternately.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="login-bold"&gt;Thanks in advance.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 11 Feb 2020 12:02:40 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/how-to-run-faster-if-it-reads-several-billions-rows/m-p/623813#M183713</guid>
      <dc:creator>rajdeep</dc:creator>
      <dc:date>2020-02-11T12:02:40Z</dc:date>
    </item>
    <item>
      <title>Re: how to run faster if it reads several billions rows</title>
      <link>https://communities.sas.com/t5/SAS-Programming/how-to-run-faster-if-it-reads-several-billions-rows/m-p/624131#M183800</link>
      <description>&lt;P&gt;The INPUTN() and INPUTC() function do not expect the format name like the INPUT() function does. You need to supply the format name as a character expression, like this:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;ADM_STD_DISP_USER_V.APA_APP_NUM = INPUTN(CIF_ACCOUNT_CURR.CIFP_CUSTOM_DATA_81,'8.')&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Note that in FedSQL, you MUST use &lt;EM&gt;single&lt;/EM&gt; quotes. Double quotes are interpreted by FedSQL as "delimited identifiers" (similar to the 'some name'n name literals in Base SAS language).&lt;/P&gt;</description>
      <pubDate>Wed, 12 Feb 2020 10:07:04 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/how-to-run-faster-if-it-reads-several-billions-rows/m-p/624131#M183800</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2020-02-12T10:07:04Z</dc:date>
    </item>
  </channel>
</rss>

