<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Proc SQL to Data Step in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-to-Data-Step/m-p/551889#M153393</link>
    <description>&lt;P&gt;Hi, Reeza&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;No, only one condition is possible, because of the "mess" with the key's insert I had to put an "OR".&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;My data is about 500k rows on GLCR_CARTEIRA and 300k on the CONTRATO_CESS&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;" Is it indexed on any variables?" No, it does not have index.&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Wed, 17 Apr 2019 20:35:37 GMT</pubDate>
    <dc:creator>vpanzagarcia</dc:creator>
    <dc:date>2019-04-17T20:35:37Z</dc:date>
    <item>
      <title>Proc SQL to Data Step</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-to-Data-Step/m-p/551885#M153391</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I was told that Data steps performs better than Proc SQL, so I need help at "translating" my code to data steps.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;My proc sql is taking too long to finish, I think its because its too big the file, sometimes the Enterprise Guide crashs when i run the code.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;My Proc Sql is:&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql noprint;
	create table OPER_SAC as
     SELECT DISTINCT
		GLCR_CARTEIRA_SAC.CUSTOMER_CNPJ_PARENT_L format=BEST14.,
		CONTRATO_CESS.CodigoCedente length=8 format=BEST14. AS CodigoCedente,
        CONTRATO_CESS.CPFCNPJCliente length=8 format=BEST14. AS CPFCNPJCliente,
		GLCR_CARTEIRA_SAC.NAME_L length=56 format=$56. AS NAME_L_SAC
	FROM LSRCRDPJ.GLCR_CARTEIRA GLCR_CARTEIRA_SAC 
	INNER JOIN LSRCRDPJ.CONTRATO_CESS CONTRATO_CESS
	ON (substr(put(GLCR_CARTEIRA_SAC.CUSTOMER_CNPJ_PARENT_L,14.),7,8) = (substr(put(CONTRATO_CESS.CPFCNPJCliente,14.),1,8)
	OR substr(put(GLCR_CARTEIRA_SAC.CUSTOMER_CNPJ_PARENT_L,14.),1,8) = substr(put(CONTRATO_CESS.CPFCNPJCliente,14.),1,8))
	ORDER BY GLCR_CARTEIRA_SAC.CUSTOMER_CNPJ_PARENT_L, 
		CONTRATO_CESS.CPFCNPJCliente,
		CONTRATO_CESS.CodigoCedente,
		GLCR_CARTEIRA_SAC.NAME_L
	;
QUIT;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;The key 'CUSTOMER_CNPJ_PARENT_L' and "CPFCNPJCliente' are not unique, don't ask me why, thats the data I have to work with.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;So please, if you could help me translating to data steps I'll appreciate it.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you&lt;/P&gt;</description>
      <pubDate>Wed, 17 Apr 2019 20:25:47 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-to-Data-Step/m-p/551885#M153391</guid>
      <dc:creator>vpanzagarcia</dc:creator>
      <dc:date>2019-04-17T20:25:47Z</dc:date>
    </item>
    <item>
      <title>Re: Proc SQL to Data Step</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-to-Data-Step/m-p/551886#M153392</link>
      <description>Your join condition, the OR is likely where the speed issues are originating. How big is your data? Is it indexed on any variables? Is that an option?&lt;BR /&gt;&lt;BR /&gt;For your join, is it possible both conditions can be met? Or will only one match?</description>
      <pubDate>Wed, 17 Apr 2019 20:30:47 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-to-Data-Step/m-p/551886#M153392</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2019-04-17T20:30:47Z</dc:date>
    </item>
    <item>
      <title>Re: Proc SQL to Data Step</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-to-Data-Step/m-p/551889#M153393</link>
      <description>&lt;P&gt;Hi, Reeza&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;No, only one condition is possible, because of the "mess" with the key's insert I had to put an "OR".&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;My data is about 500k rows on GLCR_CARTEIRA and 300k on the CONTRATO_CESS&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;" Is it indexed on any variables?" No, it does not have index.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 17 Apr 2019 20:35:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-to-Data-Step/m-p/551889#M153393</guid>
      <dc:creator>vpanzagarcia</dc:creator>
      <dc:date>2019-04-17T20:35:37Z</dc:date>
    </item>
    <item>
      <title>Re: Proc SQL to Data Step</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-to-Data-Step/m-p/551947#M153417</link>
      <description>&lt;P&gt;&lt;EM&gt;I was told that Data steps performs better than Proc SQL, so I need help at "translating" my code to data steps.&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;This is not always true. It depends.&lt;/P&gt;
&lt;P&gt;In particular:&lt;/P&gt;
&lt;P&gt;- For non-unique keys in more than one table, a data step can be more cumbersome than SQL&lt;/P&gt;
&lt;P&gt;- When merging on several conditions (like your OR clause), SQL again is more natural&lt;/P&gt;
&lt;P&gt;- When merging by transformed values rather than variable values, data steps are again ill-equipped&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;As you can see, as it is the process is very suited to SQL rather than a data step.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Still, hash tables bring a new flexibility and you may want to try something like this:&lt;/P&gt;
&lt;LI-CODE lang="sas"&gt;proc sql;
  create table HASHDATA as 
  select unique CodigoCedente  length=8 format=best14. as CodigoCedente,
               ,CPFCNPJCliente length=8 format=best14. as CPFCNPJCliente,
               ,put(put(CPFCNPJCliente,14.),$8.)       as KEY 
  from LSRCRDPJ.CONTRATO_CESS;
quit;

data TEST;   
  set LSRCRDPJ.GLCR_CARTEIRA (keep=CUSTOMER_CNPJ_PARENT_L NAME_L );
  if _N_=1 then do;
    dcl hash H(dataset:'HASHDATA');
    H.definekey('KEY');
    H.definedata('CodigoCedente','CPFCNPJCliente');
    H.definedone();
    if 0 then set HASHDATA;
  end;
  RC=H.find(key:put(put(CUSTOMER_CNPJ_PARENT_L,14.),$8.));
  if RC=0 then output;
  *else do;
    RC=h.find(key:substr(put(CUSTOMER_CNPJ_PARENT_L,14.),7,8));
    if RC=0 then output;
  *end;
run;&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Note that&lt;/P&gt;
&lt;P&gt;1.This string extract&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;LI-CODE lang="sas"&gt;&lt;SPAN class="token function"&gt;substr&lt;/SPAN&gt;&lt;SPAN class="token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token keyword"&gt;put&lt;/SPAN&gt;&lt;SPAN class="token punctuation"&gt;(&lt;/SPAN&gt;GLCR_CARTEIRA_SAC&lt;SPAN class="token punctuation"&gt;.&lt;/SPAN&gt;CUSTOMER_CNPJ_PARENT_L&lt;SPAN class="token punctuation"&gt;,&lt;/SPAN&gt;&lt;SPAN class="token number"&gt;14&lt;/SPAN&gt;&lt;SPAN class="token punctuation"&gt;.&lt;/SPAN&gt;&lt;SPAN class="token punctuation"&gt;)&lt;/SPAN&gt;&lt;SPAN class="token punctuation"&gt;,&lt;/SPAN&gt;&lt;SPAN class="token number"&gt;1&lt;/SPAN&gt;&lt;SPAN class="token punctuation"&gt;,&lt;/SPAN&gt;&lt;SPAN class="token number"&gt;8&lt;/SPAN&gt;&lt;SPAN class="token punctuation"&gt;)&lt;/SPAN&gt; 
&lt;CODE&gt;&lt;/CODE&gt;&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;is faster if written as&lt;/P&gt;
&lt;LI-CODE lang="sas"&gt;&lt;SPAN class="token function"&gt;put&lt;/SPAN&gt;&lt;SPAN class="token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token keyword"&gt;put&lt;/SPAN&gt;&lt;SPAN class="token punctuation"&gt;(&lt;/SPAN&gt;GLCR_CARTEIRA_SAC&lt;SPAN class="token punctuation"&gt;.&lt;/SPAN&gt;CUSTOMER_CNPJ_PARENT_L&lt;SPAN class="token punctuation"&gt;,&lt;/SPAN&gt;&lt;SPAN class="token number"&gt;14&lt;/SPAN&gt;&lt;SPAN class="token punctuation"&gt;.&lt;/SPAN&gt;&lt;SPAN class="token punctuation"&gt;)&lt;/SPAN&gt;&lt;SPAN class="token punctuation"&gt;,$&lt;/SPAN&gt;&lt;SPAN class="token number"&gt;8.&lt;/SPAN&gt;&lt;SPAN class="token punctuation"&gt;)&lt;/SPAN&gt;&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;2.No provision is made for duplicate key in the hash data. Tell us if the speed improves with this setup.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;3. Uncomment the do block if you want to avoid duplicate matches&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 18 Apr 2019 00:35:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-to-Data-Step/m-p/551947#M153417</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2019-04-18T00:35:12Z</dc:date>
    </item>
    <item>
      <title>Re: Proc SQL to Data Step</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-to-Data-Step/m-p/551971#M153429</link>
      <description>&lt;P&gt;PUT() doesn't do very well in SQL - You could try using a math join condition instead:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;ON mod(GLCR_CARTEIRA_SAC.CUSTOMER_CNPJ_PARENT_L,1E8) = int(CONTRATO_CESS.CPFCNPJCliente/1E6)
	OR int(GLCR_CARTEIRA_SAC.CUSTOMER_CNPJ_PARENT_L/1E6) = int(CONTRATO_CESS.CPFCNPJCliente/1E6) &lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 18 Apr 2019 02:45:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-to-Data-Step/m-p/551971#M153429</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2019-04-18T02:45:13Z</dc:date>
    </item>
    <item>
      <title>Re: Proc SQL to Data Step</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-to-Data-Step/m-p/551979#M153432</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/462"&gt;@PGStats&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Maths are indeed faster. They require all numbers to be exactly 14-digit long though.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data T1 T2; 
do I=1 to 1e4;
  J=I+1e13;
  K=I*1000+J;
  output;
end;
format J K 16.0;
run;  
       
proc sql ;
 create table X1 as
 select T1.* from T1,T2
 where put(put(T1.J,14.),$8.) =    put(put(T2.K,14.),$8.)
    or put(put(T1.J,14.),$8.) = substr(put(T2.K,14.),7,8);
quit;
   
proc sql ;
 create table X2 as
 select T1.* from T1,T2
 where substr(put(T1.J,14.),1,8) = substr(put(T2.K,14.),1,8)
    or substr(put(T1.J,14.),1,8) = substr(put(T2.K,14.),7,8);
quit;
 
proc sql ;
 create table X3 as
 select T1.* from T1,T2
 where int(T1.J/1e6) = int(T2.K/1e6)
    or int(T1.J/1e6) = mod(T2.K,1e8) ;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;NOTE: Table WORK.X1 created, with 9990000 rows and 3 columns.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;real time 5.67 seconds&lt;BR /&gt;user cpu time 5.28 seconds&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;NOTE: Table WORK.X2 created, with 9990000 rows and 3 columns.&lt;/P&gt;
&lt;P&gt;real time 6.08 seconds&lt;BR /&gt;user cpu time 5.89 seconds&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;NOTE: Table WORK.X3 created, with 9990000 rows and 3 columns.&lt;/P&gt;
&lt;P&gt;real time 4.40 seconds&lt;BR /&gt;user cpu time 4.20 seconds&lt;/P&gt;</description>
      <pubDate>Thu, 18 Apr 2019 03:38:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-to-Data-Step/m-p/551979#M153432</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2019-04-18T03:38:00Z</dc:date>
    </item>
    <item>
      <title>Re: Proc SQL to Data Step</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-to-Data-Step/m-p/552102#M153477</link>
      <description>Thank You ChrisNZ, it took seconds. I'm amazed. Now I just have to learn that to improve.&lt;BR /&gt;&lt;BR /&gt;Thank You</description>
      <pubDate>Thu, 18 Apr 2019 13:35:10 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Proc-SQL-to-Data-Step/m-p/552102#M153477</guid>
      <dc:creator>vpanzagarcia</dc:creator>
      <dc:date>2019-04-18T13:35:10Z</dc:date>
    </item>
  </channel>
</rss>

