BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
SASNE
Obsidian | Level 7

Hi Gurus,

 

In oracle and sqlserver environment, after extract tables with where (FACT Table filter out unwant data) as SAS7DAT in Work folder 

 

then If i have a 8 SAS7dat,  A B C D E F G H , A = E ( Fact Table , same columns and data type , only different is data)

                                                              B = F (Ref table, same columns and data type , only different is data)

                                                              C = G (Ref table, same columns and data type , only different is data)

                                                              D = H (Ref table, same columns and data type , only different is data)

 

Would you rather   

(i) 

                             A join B join C join D 

                               E join F join G join H   first,

                              then union join both dataset                       or

(ii)

                            A Union E,  B Union F , C union G, D union H then

                            AE Join BF join CG join DH

 

when considering performance issue 

  

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Shmuel
Garnet | Level 18

If originally you have 4 tables, why did you cut them into 2 halves ?

If you can import each of them without splitting them you will save more I/O.

 

When you join any two tables you read and write same amount of data, that is 2*(x+y);

When you append  table X to table Y you read and write X only, that is 2*x;

 

So, to answer your question - YES, I beleive it is right.

View solution in original post

4 REPLIES 4
Shmuel
Garnet | Level 18

1) You may find interest in the next link:

http://www2.sas.com/proceedings/sugi29/064-29.pdf

 

2) Is it too heavy to run both methods and compare CPU time and elapse time ?

 

3) Do you need to join tables by key (ID) or just append data and merge variables  ?

     If possible then first append data of same format using proc append then 

     use either sort + merge or use sql to join them.

     Better append the smaller table to the big one of same format, in order to save I/O,

     unless you need keep the original 8 tables as they are.

     

SASNE
Obsidian | Level 7

Hi Shmuel,

 

Yes, in other word, i have 4 tables but i cut down half data in each table and create another tables, so 8 tables in total. so the columns and data type are identical  as below A = E, B =F, C=G and D=H

 

A B C D

 

E F G H

 

 

So you suggest to append AE , BF CH and DH first then join them together Using Key  later on  becase of IO reading time ?

 

Thanks

Shmuel
Garnet | Level 18

If originally you have 4 tables, why did you cut them into 2 halves ?

If you can import each of them without splitting them you will save more I/O.

 

When you join any two tables you read and write same amount of data, that is 2*(x+y);

When you append  table X to table Y you read and write X only, that is 2*x;

 

So, to answer your question - YES, I beleive it is right.

DanielSantos
Barite | Level 11

 

When it comes to performance of merging data it cannot be reduced just to the sequence of joins.

 

If that's just what you are after, then the answer is pretty much: set the fullstimer option, run both several time and see what's best.

 

If you really want to go deep in this, then why not consolidate all the data inside the database?

 

Data is stored and retrieved in a much more efficient way inside the Database.

 

If for some reason that's not possible, then please describe:

Each join type and keys, size (obs, record lenght) of each dataset.

 

Daniel Santos @ www.cgd.pt

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 1114 views
  • 0 likes
  • 3 in conversation