Help using Base SAS procedures

Improve SQL Code

Reply
Frequent Contributor
Posts: 80

Improve SQL Code

Hi people,

What's the best way to improve this SQL code below for having a good performance?

Could be in Proc SQL or Data Step.

Any help will be greatly acceptable;

Each table has about 13.018.525 rows and 250 columns.

proc sql;

    create table preparacao as

    select a.DOSSIE,

           a.TDPRCPRODCOM,

           a.TDPRCPOSA,

           a.TDPRNAGCRCVT,

           a.JAN_CREAT,

           a.LMA,

           a.TDPRMFORIG,

           a.TDPRBFINORIG,

           a.CADEIA,

           a.atraso_atual_old,

           a.TDPRCPAIEMENT,

           a.SALDO_TOTAL,

           a.TDPRCVERROU,

           a.TALPCPSA,

           a.TDPRNQUANT,

           a.TCLINIDENT,

           b.atraso_atual_old as AT_H1,

           (b.SALDO_PAG/b.SALDO_MES) as PCT_PAG_H1,

           b.SALDO_TOTAL as SALDO_H1,

           b.TODU_PFRA as TODU_PFRA_H1,          

           c.atraso_atual_old as AT_H2,

           (c.SALDO_PAG/c.SALDO_MES) as PCT_PAG_H2,

           c.SALDO_TOTAL as SALDO_H2,

           c.TODU_PFRA as TODU_PFRA_H2,

           d.atraso_atual_old as AT_H3,

           (d.SALDO_PAG/d.SALDO_MES) as PCT_PAG_H3,

           d.SALDO_TOTAL as SALDO_H3,

           d.TODU_PFRA as TODU_PFRA_H3,

           sum(b.RECEITA,-c.RECEITA) as RECEITA_H1,

           sum(c.RECEITA,-d.RECEITA) as RECEITA_H2,

           sum(d.RECEITA,-e.RECEITA) as RECEITA_H3,

     i.mtrotmes,

           i.mail,

           i.sms,

           i.tlmkt,

           i.TCLILNOMPREN,

           i.TCLFNDDDP,

           i.TCLFNTELP,

           i.TCLINDDDEMP,

           i.TCLINTELEMP,

           i.TCLINDDDCEL,

           i.TCLINTELCEL,

           i.TVDCHAINE,

           i.tdprnsoc,

           i.talpcposdosa,

           i.tdcpcficp,

           i.d_cre,

           i.TDCPBMONTSSR12M,

           i.D_PCP

    from nbdos.base_portmanagement_201203 as a   left join nbdos.base_portmanagement_201202   as b on a.DOSSIE=b.DOSSIE

                                                 left join nbdos.base_portmanagement_201201   as c on a.DOSSIE=c.DOSSIE

                                                 left join nbdos.base_portmanagement_201112   as d on a.DOSSIE=d.DOSSIE

                                                 left join nbdos.base_portmanagement_201111   as e on a.DOSSIE=e.DOSSIE

                                                 left join baseseg_201204                                 as i on a.DOSSIE=i.DOSSIE

;

    quit;

Thanks.

Super User
Posts: 17,801

Re: Improve SQL Code

Your code looks pretty straight forward to me. I'm assuming you're having performance issues of some sort?

If so my guess would be your last join.

That looks like you're joining to a work table rather than one stored on a library (possibly server) tables and then you'll have performance issues. 

Is that what you're situation?

Frequent Contributor
Posts: 80

Re: Improve SQL Code

Hi Reeza, thanks for your response.

The library nbdos is permanent.

In fact I'm just trying to get this step better.

The problem is, i'm joining 6 tables to get some variables from each one of them by proc sql that I guess is not the best way for this accomplish

and moreover left join makes sort firstly and consume alot of cpu resources, despite merge would need to sort too.

Frequent Contributor
Posts: 110

Re: Improve SQL Code

did you try hash joins?

Frequent Contributor
Posts: 80

Re: Improve SQL Code

Hi sassharp,

I don't know if hash would it be the best solution for this because as i mentioned before each table has about 13.018.525 rows.

PROC Star
Posts: 1,561

Re: Improve SQL Code

It doesn't matter that there are 13 million rows, you only use a few variables from the look-up tables, so the memory footprint won't be huge (depending on how long the variables are). Use option fullstimer to gauge memory usage.

In my quick test, 13m rows use 500MB of RAM per table, so with 3GB of RAM you can load all 6 hash tables in one go.

At worse, you can do the hash look-ups in successive data steps (especially for table i where you fetch more variables), it will still be way faster that sorting.

Respected Advisor
Posts: 4,644

Re: Improve SQL Code

Adding to Reeza's question. Please tell us for each table :

  1. Is it local or remote?
  2. Is it managed by a server?
  3. Does it have an index defined on DOSSIE?

PG

PG
Frequent Contributor
Posts: 80

Re: Improve SQL Code

Hi PG,

Here are the responses:

1 - It's remote.

2 - It's managed by a server.

3 - Yes, all table have index defined on DOSSIE.

Augusto Souza.


Respected Advisor
Posts: 4,644

Re: Improve SQL Code

Expanding on question 2 : What kind of server? File or database? Could the query be processed at the server level? - PG

PG
Super User
Posts: 17,801

Re: Improve SQL Code

Its all on a server you say, but your last table was from a work directory?

How is your performance when you exclude that last table?

If all tables are on the server and your work table is in your work library SAS has to download all the files before you can do anything on it.

Respected Advisor
Posts: 4,644

Re: Improve SQL Code

SAS-SQL might not be able to farm out the 2-parameter MAX functions either...

PG

PG
Super User
Posts: 17,801

Re: Improve SQL Code

do you mean sum? perhaps not...but those could be rewritten to be SQL compliant easily enough.

Super User
Posts: 17,801

Re: Improve SQL Code

Also set

Option msglevel=i;

And run your query to see if the indexes are being used, it'll indicate this in the log.

Ask a Question
Discussion stats
  • 12 replies
  • 341 views
  • 0 likes
  • 5 in conversation