BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
genelle_lake
Calcite | Level 5

Good day,

 

Due to a need to have data from several tables be available for querying by our business users, I created a view which left joins 10 tables and unions the results to perform the full dataset. The problem here is that the view underperforms in a significant way. In light of this I would be grateful if more sensible alternatives to my crazy view can be suggested that I can try them

 

Thanks.

1 ACCEPTED SOLUTION

Accepted Solutions
kiranv_
Rhodochrosite | Level 12

Instead of view, I would rather create a table and run the table creating process daily or whenever it is required

View solution in original post

10 REPLIES 10
kiranv_
Rhodochrosite | Level 12

Instead of view, I would rather create a table and run the table creating process daily or whenever it is required

genelle_lake
Calcite | Level 5

Thanks for this... I dont know why I never thought to just pull the data into a table.  Its up and running and being evaluated as we speak.

 

Thank you.

SASKiwi
PROC Star

What are the reasons for doing an SQL view in the first place? If you replace it with a table and refresh it daily does that satisfy your requirements?

genelle_lake
Calcite | Level 5
When the idea first came about it was thought to be a small view but as the requirements were gathered in full and implemented it became one behemoth of a view. In its defense it did work well for a while before it slowed to a crawl.

I pulled the data into a table.  Its up and running and being evaluated as we speak.

Thank you for the suggestion..
Kurt_Bremser
Super User

Complex joins often have bad performance in SAS SQL. Depending on the structure of the joins (many-to-many vs. one-to-many), using a series of data and sort steps may be better suited for your task.

Only if you need a cartesian result in a many-to-many situation will SQL be unavoidable.

genelle_lake
Calcite | Level 5
I sure can attest to this. There were far too many joins. We are evaluating the use of a replica table instead of the view.
Patrick
Opal | Level 21

@genelle_lake

Like already suggested nothing will perform as good as replicating the data by implementing a new physical table.

 

If you don't want to replicate the data then what you could try:

 

Option 1: SQL View

- Create indexes on the columns used for joining.

 

- Should this be a case of adding dimensional information to a fact table then also try the ROLE= data set option

  http://support.sas.com/documentation/cdl/en/ledsoptsref/69751/HTML/default/viewer.htm#n1lwir1q8t4uj4...

  I'd assume this option instructs the SQL compiler do use a hash lookup and you eventually don't even need indexes anymore.

 

- And last but not least: I'd assume the main time in your SQL is spent for sorting. IF your lookup tables are not too big then you could increase the value for ubufsize (buffersize prior SAS 9.4). The bigger this value the more memory will be consumed but the higher the chance that your lookups will happen in memory (hash) and not physically by sorting.

http://support.sas.com/documentation/cdl/en/sqlproc/69822/HTML/default/viewer.htm#p12ohgh32ffm6un13s...

This quite dated but very interesting article explains how things work:  https://support.sas.com/techsup/technote/ts553.html 

 

 

Option 2: Datastep View

If you're joining a big table to multiple rather small lookup tables using different keys then you could also go for a data step view and use SAS data step hash lookups. This would avoid the need to sort the big table multiple times which could save quite a bit of time.

 

genelle_lake
Calcite | Level 5
Thank you for your very thorough and thoughtful response on this. A replica table has been built and is currently being evaluated.
sschleede
Obsidian | Level 7

When I left join tables, I try to always join to a common table. e.g.

select ...

from a

left join

b

on a.id = b.id

left join

c

on a.id = c.id

left join

d

on a.id = d.id

 

rather than this equally correct, but sometimes dangerous approach.

select ...

from a

left join

b

on a.id = b.id

left join

c

on b.id = c.id

left join

d

on c.id = d.id

 

I wasn't sure what you were doing with the Unions. I have often used a Common Table Expression to get all of the values I need for the join. e.g. (not sure I have the syntax exactly right here)

;with All_IDs as

(select ID from

(select distinct ID from a

union

select distinct ID from b

union

select distinct ID from c))

 

select ...

from

All_IDs

left join

a

on All_IDs.ID = a.ID

left join

b

on All_IDs.ID = b.ID

left join

c

on All_IDs.ID = c.ID

 

I also like to break down the query to its subcomponents to see which part is particularly running slow. This helps focus my optimization work.

genelle_lake
Calcite | Level 5
I'll have to take this approach into future use when I have joins to perform. It seems to be a moresensible version of my approach.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 10 replies
  • 1338 views
  • 2 likes
  • 6 in conversation