Hi,
I am using views in some data steps, in order to reduce I/O operations and improve performance when using large datasets.
I have tested this in some programs. Each program produces a result dataset plus several intermediate datasets (stored in the WORK library).
I am replacing the intermediate datasets with views. I have found that if I replace the intermediate dataset with a view in one or two steps, performance improves, but if I use views in more than two steps, performance is the same as with datasets, or even worse. In both situations I create a result dataset (not a view) in the final step.
I would like to know the best practices for using views: are they good only for one or two steps, or only for certain types of operations? My steps are DATA/SET steps that add new fields, joins, and aggregations.
Thanks in advance
If you re-use a data set view, then it is reconstructed each time from the original source, which is probably why you don't see expected benefits sometimes when using the view multiple times. This would be especially true when the view is a merge of multiple datasets, or the view is a small subset of the original.
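A minimal sketch of that re-execution, assuming a made-up source table: the names work.bigdata and v_subset and the variables id, grp, x and y are hypothetical and only serve the illustration.

```sas
/* Hypothetical source table, generated here only so the sketch runs on its own */
data work.bigdata;
    do id = 1 to 1000000;
        grp = mod(id, 10);
        x = ranuni(1);
        output;
    end;
run;

/* Defining the view only compiles and stores the step; no rows are read yet */
data v_subset / view=v_subset;
    set work.bigdata;
    where grp = 3;
    y = x * 2;
run;

/* First reference: the view step executes and reads work.bigdata */
proc means data=v_subset;
    var y;
run;

/* Second reference: the view step executes again, reading work.bigdata a second time */
proc freq data=v_subset;
    tables grp;
run;
```

If you run something like this, the log for each proc should also report the time spent in the view itself, so you can see the cost being paid on every reference.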
One way to mitigate this problem is to define a data set view (VNEED) and a matching data set file (NEED) in the same DATA step, as below. The first use (proc freq) calls VNEED, which in the background reads BIGDATA, streams the rows through VNEED, and writes NEED. Note that proc freq doesn't have to wait for dataset NEED to be completely generated. The subsequent proc univariate then reads the data set file NEED.
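/* One step defines both the view VNEED and the data set file NEED */
data need vneed / view=vneed;
  set bigdata;
  where ....;
  ....
run;
/* First access goes through the view, which also writes NEED */
proc freq data=vneed;
  tables .... ;
run;
/* Later accesses read the stored data set NEED, not BIGDATA */
proc univariate data=need;
  ....
run;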
So depending on the relative size of bigdata vs need, you might save considerable time by using the view only for the first access and the data file for the second and later accesses.