BookmarkSubscribeRSS Feed
vfarmak
Quartz | Level 8

 

Hi Everyone,

 

I am using Data Integration Studio and I need to write data to a sas dataset from an SAS View.

Despite the fact that I have aggregated all the values in the final step (SAS View), it takes a significant amount of time to write the data (even if they are still aggregated). Any suggestions?

 

 

6 REPLIES 6
s_lassen
Meteorite | Level 14

Sounds like it is the execution of the view that is taking the time. You say that you have done a lot of aggregation in the view (I assume it is an SQL view, not a data step view), but when you create the view, the data is not aggregated. That happens when you read the view, not before.

 

How to make a view execute faster? One possibility is to create indexes on the contributing tables. Another may be (if the input data is sorted in the correct order) to create a data step view instead of an SQL view. The third possibility is to create a table instead of a view - that will move the time spent on execution from the reading to the creating - if the input data to the view is not updated that often, but the view is often read, that may be a good solution.

vfarmak
Quartz | Level 8

Thank you for your tip!

I will try the datastep view, since I have not used it.

I mainly use views due to the fact that are stored in memory and not on disk.

What I am trying to achieve is leverage the Proc SQL Views to compute fast data and when the information is aggregated to store them. 

 

It seems that many things happen behind the scenes. I thought that all the information stored in the PDV.

Thank you!

Kurt_Bremser
Super User

@vfarmak wrote:

...

I mainly use views due to the fact that are stored in memory and not on disk.


This is a misconception. A view is stored on disk, like other library members. While it is executed when you use it, it has to use the same resources which would have been used if the code created a dataset. This means use of temporary files in WORK (and/or UTILLOC, if that is defined) as soon as a part of the transformation can not be done in memory alone.

More about UTILLOC is found here.

Patrick
Opal | Level 21

A SQL view is nothing else than encapsulated SQL that gets executed when you use the view. During the execution of the view there might still be a lot of sorting and read/write to temporary tables under utilloc.

It's a while that I've actually done something with DIS but I normally avoid using views because if anything falls over then you get the error only in the node that uses the view which makes debugging harder. 

I also observed that switching from a table to a view often doesn't significantly improve performance.

My recommendation would be: 

1. First define your target as table and not as view and performance tweak your SQL transformation (like if there are a lot of joins then eventually change the order of the joins like if there are joins using the same key then have them in consecutive order, only select the target columns you need (no select *), etc. etc ).

2. Only start switching tables to views at the end of developing a DIS flow and once you know how long the individual nodes and the flow as a whole execute with tables only. This allows you determine if switching to views actually improves performance. I'd only keep the views if the improvement is significant enough to outweigh the complications when it comes to error investigation). 

vfarmak
Quartz | Level 8

I mainly use indexes and try to influence SAS on what to pick up in terms of performance.

However I am not aware of the utilloc you mentioned. Do you have any material / url to know more details about it?

 

Thank you!

Vasilis

Patrick
Opal | Level 21

@vfarmak UTILLOC is just a session specific location on disk like WORK where SAS stores "temporary" files. If not defined specifically then UTILLOC uses the same location as WORK. Run Proc Options to see what's defined for your environment.

 

The point I tried to make: Creating a view doesn't change the need for SAS to actually process the data. It just only happens when you use the view and not when you create it.

 

For example Proc Sort or a SQL that needs to sort data (like for a join) uses UTILLOC for such files if the process can't run fully in-memory.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 658 views
  • 2 likes
  • 4 in conversation