Good day,
Due to a need to have data from several tables be available for querying by our business users, I created a view which left joins 10 tables and unions the results to perform the full dataset. The problem here is that the view underperforms in a significant way. In light of this I would be grateful if more sensible alternatives to my crazy view can be suggested that I can try them
Thanks.
Instead of view, I would rather create a table and run the table creating process daily or whenever it is required
Instead of view, I would rather create a table and run the table creating process daily or whenever it is required
Thanks for this... I dont know why I never thought to just pull the data into a table. Its up and running and being evaluated as we speak.
Thank you.
What are the reasons for doing an SQL view in the first place? If you replace it with a table and refresh it daily does that satisfy your requirements?
Complex joins often have bad performance in SAS SQL. Depending on the structure of the joins (many-to-many vs. one-to-many), using a series of data and sort steps may be better suited for your task.
Only if you need a cartesian result in a many-to-many situation will SQL be unavoidable.
Like already suggested nothing will perform as good as replicating the data by implementing a new physical table.
If you don't want to replicate the data then what you could try:
Option 1: SQL View
- Create indexes on the columns used for joining.
- Should this be a case of adding dimensional information to a fact table then also try the ROLE= data set option
I'd assume this option instructs the SQL compiler do use a hash lookup and you eventually don't even need indexes anymore.
- And last but not least: I'd assume the main time in your SQL is spent for sorting. IF your lookup tables are not too big then you could increase the value for ubufsize (buffersize prior SAS 9.4). The bigger this value the more memory will be consumed but the higher the chance that your lookups will happen in memory (hash) and not physically by sorting.
This quite dated but very interesting article explains how things work: https://support.sas.com/techsup/technote/ts553.html
Option 2: Datastep View
If you're joining a big table to multiple rather small lookup tables using different keys then you could also go for a data step view and use SAS data step hash lookups. This would avoid the need to sort the big table multiple times which could save quite a bit of time.
When I left join tables, I try to always join to a common table. e.g.
select ...
from a
left join
b
on a.id = b.id
left join
c
on a.id = c.id
left join
d
on a.id = d.id
rather than this equally correct, but sometimes dangerous approach.
select ...
from a
left join
b
on a.id = b.id
left join
c
on b.id = c.id
left join
d
on c.id = d.id
I wasn't sure what you were doing with the Unions. I have often used a Common Table Expression to get all of the values I need for the join. e.g. (not sure I have the syntax exactly right here)
;with All_IDs as
(select ID from
(select distinct ID from a
union
select distinct ID from b
union
select distinct ID from c))
select ...
from
All_IDs
left join
a
on All_IDs.ID = a.ID
left join
b
on All_IDs.ID = b.ID
left join
c
on All_IDs.ID = c.ID
I also like to break down the query to its subcomponents to see which part is particularly running slow. This helps focus my optimization work.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.