ChrisNZ
Tourmaline | Level 20
We know that views can be faster for sequential processing as they potentially avoid having to write and read intermediate tables on disk.

In this case, where I merge many tables (so reads are concurrent), views are also faster. Does anyone know why?
[pre]
%macro t;

%do i=1 %to 30;            * create input tables and views;
  data T&i;
    retain X1-X150 1;
    do I=1 to 100000;
      output;
    end;
  run;
  data _T&i/view=_T&i;
    set T&i(keep=X1-X9 I);
  run;
%end;

data TESTA;                * merge views: 15s/30s/7mn;
  merge %do i=1 %to 30;
          _T&i(rename=(X1=X1_&i X2=X2_&i X3=X3_&i X4=X4_&i X5=X5_&i X6=X6_&i X7=X7_&i X8=X8_&i X9=X9_&i))
        %end;
  ;
  by I;
run;

data TESTB;                * merge tables: 25s/45s/12mn;
  merge %do i=1 %to 30;
          T&i(keep=X1-X9 I
              rename=(X1=X1_&i X2=X2_&i X3=X3_&i X4=X4_&i X5=X5_&i X6=X6_&i X7=X7_&i X8=X8_&i X9=X9_&i))
        %end;
  ;
  by I;
run;

%mend;
%t


[/pre]

The three times in the comments above are from different PCs with varying disk setups (the fastest times are on a RAID0 array of four 15k disks, the slowest on a desktop PC).
PS: Don't even think of trying this with a SQL full (outer) join, or do it over your lunch break!
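For scale, here is a minimal sketch of what the SQL equivalent would look like, shown for just two of the thirty inputs (same table and variable names as the macro above); presumably SAS has to chain the full joins pairwise across all thirty inputs, which is why it takes so long:

```sas
/* Full outer join of just two inputs; with 30 inputs these
   joins must be chained pairwise, hence the lunch break. */
proc sql;
  create table TESTC as
  select coalesce(a.I, b.I) as I,
         a.X1 as X1_1,
         b.X1 as X1_2
  from T1 as a
  full join T2 as b
    on a.I = b.I;
quit;
```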
ChrisNZ
Tourmaline | Level 20
Bumping this once as a last attempt to get a comment.
Has anyone replicated this behaviour?
Any idea why views would be faster?

Two explanations I can think of:

- When using views, the input tables are read sequentially.
When using tables, all tables are read concurrently, one record at a time (which would thrash the disks).
But then where is the data stored? Memory usage is higher with views, but nowhere near enough to hold all the data.
Maybe the tables are simply read in much larger blocks when views are used?

- The KEEP= option is not applied as efficiently when reading tables directly, and full observations are read even though only a few variables end up in the limited PDV.

I added the option BUFNO=1k when reading the tables T&i and the times decreased dramatically (though memory usage somehow stayed the same).
So it looks like the first idea might be right: views use larger blocks.
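For reference, the change was just the BUFNO= data set option on each input. A minimal two-table sketch of the idea (BUFNO= sets how many page buffers SAS allocates when reading the data set; the table names match the macro above):

```sas
/* Read each table with 1024 I/O buffers instead of the default,
   so SAS fetches many more pages per disk request. */
data TESTB_BUF;
  merge T1(keep=X1 I bufno=1k rename=(X1=X1_1))
        T2(keep=X1 I bufno=1k rename=(X1=X1_2));
  by I;
run;
```

Note that KEEP= is applied before RENAME=, so the original variable names are used in the KEEP= list.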

Any comment?
Oleg_L
Obsidian | Level 7
Chris,
if we move the KEEP= option from the data steps that create the views to the step that merges the views, then TESTA takes much longer on my PC:

NOTE: The data set WORK.TESTA has 100000 observations and 271 variables.
NOTE: Compressing data set WORK.TESTA decreased size by 49.95 percent.
Compressed is 7151 pages; un-compressed would require 14288 pages.
NOTE: DATA statement used (Total process time):
real time 13:19.97
cpu time 29.10 seconds

NOTE: The data set WORK.TESTB has 100000 observations and 271 variables.
NOTE: Compressing data set WORK.TESTB decreased size by 49.95 percent.
Compressed is 7151 pages; un-compressed would require 14288 pages.
NOTE: DATA statement used (Total process time):
real time 6:12.93
cpu time 19.17 seconds

So I think your 2nd explanation is right.
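For clarity, the variant tested above looks like this, sketched for two inputs (the views no longer subset, so the full 150-variable rows flow through them):

```sas
/* Views defined WITHOUT keep=, so each view carries all 150 variables */
data _T1/view=_T1; set T1; run;
data _T2/view=_T2; set T2; run;

/* Subsetting now happens only at merge time, after the view
   has already produced the full-width observation. */
data TESTA2;
  merge _T1(keep=X1 I rename=(X1=X1_1))
        _T2(keep=X1 I rename=(X1=X1_2));
  by I;
run;
```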
DanielSantos
Barite | Level 11
Hello Chris.

Somehow I missed this very interesting post.

Will submit your code on my AIX box...

Cheers from Portugal.

Daniel Santos @ www.cgd.pt
DanielSantos
Barite | Level 11
Done.

Although the merge with views executed in less time, performance is nearly identical...

Merge with views:

NOTE: DATA statement used (Total process time):
real time 16:24.78
user cpu time 41.00 seconds
system cpu time 32.28 seconds
Memory 22663k
Page Faults 185862
Page Reclaims 15291
Page Swaps 0
Voluntary Context Switches 9793
Involuntary Context Switches 6277
Block Input Operations 0
Block Output Operations 0

Merge with tables:

NOTE: DATA statement used (Total process time):
real time 16:29.87
user cpu time 37.82 seconds
system cpu time 29.61 seconds
Memory 21024k
Page Faults 185651
Page Reclaims 8430
Page Swaps 0
Voluntary Context Switches 9780
Involuntary Context Switches 8154
Block Input Operations 0
Block Output Operations 0

- Page faults are about the same (I/O), but page reclaims (done in memory) are much higher (almost double) for merge/views, which may explain the 5-second difference.
- Memory usage was about the same for both techniques.
- CPU usage was a little lower with merge/tables, which is reasonable, since views should carry some processing overhead to map the logical definition onto the physical data.

But your theory may actually be right (views read by blocks vs. tables by rows); the difference in page reclaims may indicate that: data read into memory in larger blocks (and retrieved from there) vs. data read in smaller blocks with many more I/O operations.
Anyway, as always with performance, it is a system-dependent matter. Based on the above theory, on our system, although merge/views performed fewer I/O operations, the total processing time remained pretty much the same.

Cheers from Portugal.

Daniel Santos @ www.cgd.pt.
DanielSantos
Barite | Level 11
Oh, by the way: I had never stopped to think about this subject, so thanks for bringing it up!

Cheers from Portugal.

Daniel Santos @ www.cgd.pt
ChrisNZ
Tourmaline | Level 20
Thanks for your input guys, much appreciated.
Nice machine you have, Daniel!
Opa4ki
Calcite | Level 5
Solaris 10, quite a strong production box. I don't see the effect: tables are faster.

NOTE: The data set WORK.TESTA has 100000 observations and 271 variables.
NOTE: DATA statement used (Total process time):
real time 49.00 seconds
user cpu time 17.17 seconds
system cpu time 29.58 seconds
Memory 12129k
Page Faults 7
Page Reclaims 0
Page Swaps 0
Voluntary Context Switches 814
Involuntary Context Switches 713
Block Input Operations 1
Block Output Operations 1

NOTE: The data set WORK.TESTB has 100000 observations and 271 variables.
NOTE: DATA statement used (Total process time):
real time 52.00 seconds
user cpu time 16.01 seconds
system cpu time 35.29 seconds
Memory 10482k
Page Faults 0
Page Reclaims 0
Page Swaps 0
Voluntary Context Switches 877
Involuntary Context Switches 849
Block Input Operations 0
Block Output Operations 1
DanielSantos
Barite | Level 11
In your case, I would say that it is a matter of "too much" power. 🙂

Maybe you could scale up Chris's code to suit your platform.

And could you please share your hardware setup? Nice system you have there.

Cheers from Portugal.

Daniel Santos @ www.cgd.pt
Opa4ki
Calcite | Level 5
uname -a
SunOS two 5.10 Generic_138888-08 sun4u sparc SUNW,Sun-Fire-15000

20 sparcv9 processors, presented as 40 virtual processors at 1800 MHz.
vmstat shows ~120G free swap and 110G free physical memory.

DWH ETL in telecom

About the tests: I don't think I'll spend time doing anything more meaningful with this in the near future.
ChrisNZ
Tourmaline | Level 20
It looks like this is a Windows platform behaviour then, probably due to different SAS buffer defaults for views and tables.

It is well worth knowing about this in any case: one process was reduced from 2 hours to 10 min.

I am very curious about what the SAS engine does differently for such dramatic improvements to take place.
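If the buffer-defaults hypothesis is right, forcing the same buffering for plain tables should close most of the gap. A hedged sketch using the BUFNO= system option (the value chosen here is illustrative, not a recommendation):

```sas
/* Raise the default number of I/O buffers for every data set
   opened in this session, then rerun the table merge. */
options bufno=1k;

data TESTB2;
  merge T1(keep=X1 I rename=(X1=X1_1))
        T2(keep=X1 I rename=(X1=X1_2));
  by I;
run;
```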
