Re: PLZ: Macro Urgent Help

ChrisNZ · Posted 07-07-2009 06:09 PM

Daniel,
Re. views, you've piqued my interest. I'll start a new thread.

High-Performance SAS Coding - Third Edition

ChrisNZ · Posted 07-07-2009 11:28 PM

Probably not worth starting a new thread, though this is bit off-topic.

I've conducted a few tests.

[pre]data SASUSER.C;
do I=1 to 2e7;
A=ranuni(0);
B=round(A,.1);
C1=B; C2=B;
output;
end;
run;

%let s=%sysfunc(time());
signon task1 sascmd="!sascmd";
rsubmit task1 wait=no;
libname OUTLIB sasesock ":irc" timeout=5000;
proc sort data=SASUSER.C out=OUTLIB.C1;
by B A;
run;
endrsubmit;
rdisplay task1;
signon task2 sascmd="!sascmd";
rsubmit task2 wait=no;
libname INLIB sasesock ":irc" timeout=5000;
data D1;
set INLIB.C1;
by B;
if last.B;
run;
endrsubmit;
rdisplay task2;
waitfor task2;
%put time=%sysevalf(%sysfunc(time())-&s); ***** MP CONNECT: 130s median time for 3 runs;

proc sort data=SASUSER.C out=C1;
by B ;
data D;
retain _A _C1 _C2;
drop _:;
set C1;
by B;
if _A lt A then do;
_A=A; _C1=C1; _C2=C2;
end;
if last.B;
A=_A; C1=_C1; C2=_C2;
run; ******** SORT TO TABLE: 140s median time for 3 runs;

proc sql;
create view V as select * from SASUSER.C order by B, A;
data D1;
set V;
by B;
if last.B;
run; ******** SORT TO VIEW: 150s median time for 3 runs;
[/pre]

If I make the table wider (C2 as $50), the time differences get larger:
MPC 185s
TABLE 350s
VIEW 450s

So the view is indeed slower, probably due to using SQL rather than proc sort, as you mention.
I'd still use that syntax for small tables as it is a lot easier to read.

The most difficult code to read is MPC, but it is also the fastest, saving on I/O. I hope sas will provide local multithreaded syntax soon to do this kind of processing, rather than having to remote submit. It would make the code a lot cleaner.

Lastly, in this case we can also use
[pre]
proc summary data=SASUSER.C nway noprint ;
class B;
var A;
output out=t max= maxid(A(c1))=c1 maxid(A(c2))=c2;
run;
[/pre]
which runs in a mere 23s for the wide table, so is by very far the most efficient.

High-Performance SAS Coding - Third Edition

DanielSantos · Posted 07-09-2009 07:25 AM

Hello Chris.

Very good analysis the one you provided.

Although very off the original topic, this is getting very interesting.
I find it very nice and valuable to have some more academic discussion here.

I've ran your tests (3 runs) on my machine (AIX box with 12 lcpu and 20GB of RAM) and added 4 more variants to the solution (code posted at the end):

+ Summary Hash technique.
+ SORT(one var only)+DATASTEP with Piping Parallelism
+ MP(4 parts) SORT(one var only)+DATASTEP with Piping Parallelism
+ MP(4 parts) SORT(one var only)+DATASTEP with no Piping Parallelism

Results for the same data of your trials:

SORT(1VAR)+STEP time=32s, 32s, 32s
SQL(2VAR)+STEP time=58s, 57s, 60s
PIPED SORT(2VAR)+STEP time=31s, 30s, 31s
SUMMARY time=5s, 5s, 5s

PIPED SORT(1VAR)+STEP time=26s, 27s, 27s
PIPED/MP 4xSORT(1VAR)+STEP time=24s, 25s, 24s
HASH SUM time=29s, 30s, 32s
MP 4xSORT(1VAR)+STEP time=23s, 25s, 24s

Conclusion.
SQL worst.
Hash not brilliant but, keeps up with the average.
Piped Sort by 1 var does a better job than by 2.
MP(4 parts) good, but pretty much the same with or without piping.
And the gold medal goes for Summary by far.

Now, beside the amount of data, two more factors should be considered.

Type of data - the generated sample produces only 7 distinct values for B.
System load - some techniques (MP/piping and MP+piping) could be greatly influenced by the system load at execution.

So I slightly modified your sample to produce about 1000 distinct values for B.
Then I ran the same tests (3 runs), and here are the results:

SORT(1VAR)+STEP time=59s, 40s, 41s
SQL(2VAR)+STEP time=96s, 68s, 89s
PIPED SORT(2VAR)+STEP time=215s, 39s, 39s
SUMMARY time=63s, 59s, 70s

PIPED SORT(1VAR)+STEP time=45s, 44s, 49s
PIPED/MP 4xSORT(1VAR)+STEP time=37s, 37s, 68s
HASH SUM time=49s, 38s, 36s
MP 4xSORT(1VAR)+STEP time=32s, 37s, 177s

Conclusion.
SQL still worst.
Hash not brilliant but, again keeps up with the average.
MP/Piping processing produces the best marks, but are highly dependent on the system load, and so less stable in performance.
Summary has a lot of more to handle, and goes down on the ranking, very near to SQL performance.

So about summing data with SAS, I would conclude by this results, that there is no particular recommended technique. Still, avoid SQL unless for small amount of data.
Hash, not being the best, will assure you a good performance, but careful with the memory limitations (roughly I would say a limit of 2,000,000 of distinct elements would be reasonable).
Then we have PROC SUMMARY/MEANS which on v9 with its threading support as got a lot better, but, be aware that performance will dramatically reduce with the increase of data, and particularly with the number of distinct elements your grouping.
Multi-Processing with or without Piping Parallism, pushes away your processing power, but are alway very dependent on your system load, and results may vary from best to worst.

Chris, thank you for pushing this subject a little further, I find it great and mentally challenging to have from time to time this kind of discussion.

Cheers from Portugal.

Daniel Santos @ www.cgd.pt.

DanielSantos · Posted 07-09-2009 07:31 AM

Now the code.

Hash Sum:

data _null_;
declare hash HT(hashexp: 16);
HT.defineKey('B');
HT.defineData('B', 'A_MAX', 'C1_MAX', 'C2_MAX');
HT.defineDone();
do until (_EOF);
set LIB.C end = _EOF;
if HT.find() or A gt A_MAX then do;
A_MAX=A; C1_MAX=C1; C2_MAX=C2;
end;
HT.replace();
end;
_RC=HT.output(dataset: 'LIB.D4');
run;

Proc Sort (by one var) and datastep with piping parallelism:

signon task1 sascmd="!sascmd";
rsubmit task1 wait=no;
libname LIB '/batch/sas/SASTMP';
libname OUTLIB sasesock ":9981" timeout=5000;
proc sort data=LIB.C out=OUTLIB.C5;
by B;
run;
endrsubmit;
signon task2 sascmd="!sascmd";
rsubmit task2 wait=no;
libname LIB '/batch/sas/SASTMP';
libname INLIB sasesock ":9981" timeout=5000;
data LIB.D5;
retain _A _C1 _C2;
drop _:;
set INLIB.C5;
by B;
if _A lt A then do;
_A=A; _C1=C1; _C2=C2;
end;
if last.B;
A=_A; C1=_C1; C2=_C2;
run;
endrsubmit;
waitfor task2;
rget task1;
rget task2;

MP (4 parts) with Proc Sort (by one var) and datastep with piping parallelism:

signon task1_1 sascmd="!sascmd";
rsubmit task1_1 wait=no;
libname LIB '/batch/sas/SASTMP';
libname OUTLIB sasesock ":9971" timeout=5000;
proc sort data=LIB.C (firstobs=1 obs=5000000) out=OUTLIB.C7_1;
by B;
run;
endrsubmit;
signon task1_2 sascmd="!sascmd";
rsubmit task1_2 wait=no;
libname LIB '/batch/sas/SASTMP';
libname OUTLIB sasesock ":9972" timeout=5000;
proc sort data=LIB.C (firstobs=5000001 obs=10000000) out=OUTLIB.C7_2;
by B;
run;
endrsubmit;
signon task1_3 sascmd="!sascmd";
rsubmit task1_3 wait=no;
libname LIB '/batch/sas/SASTMP';
libname OUTLIB sasesock ":9973" timeout=5000;
proc sort data=LIB.C (firstobs=10000001 obs=15000000) out=OUTLIB.C7_3;
by B;
run;
endrsubmit;
signon task1_4 sascmd="!sascmd";
rsubmit task1_4 wait=no;
libname LIB '/batch/sas/SASTMP';
libname OUTLIB sasesock ":9974" timeout=5000;
proc sort data=LIB.C (firstobs=15000001 obs=20000000) out=OUTLIB.C7_4;
by B;
run;
endrsubmit;
signon task2 sascmd="!sascmd";
rsubmit task2 wait=no;
libname LIB '/batch/sas/SASTMP';
libname INLIB1 sasesock ":9971" timeout=5000;
libname INLIB2 sasesock ":9972" timeout=5000;
libname INLIB3 sasesock ":9973" timeout=5000;
libname INLIB4 sasesock ":9974" timeout=5000;
data LIB.D5;
retain _A _C1 _C2;
drop _:;
set INLIB1.C7_1 INLIB2.C7_2 INLIB3.C7_3 INLIB4.C7_4;
by B;
if _A lt A then do;
_A=A; _C1=C1; _C2=C2;
end;
if last.B;
A=_A; C1=_C1; C2=_C2;
run;
endrsubmit;
waitfor task2;
rget task1_1;
rget task1_2;
rget task1_3;
rget task1_4;
rget task2;

Finally, MP (4 Parts) Proc Sort (by one var) and datastep without piping parallelism:

signon task1_1 sascmd="!sascmd";
rsubmit task1_1 wait=no;
libname LIB '/batch/sas/SASTMP';
proc sort data=LIB.C (firstobs=1 obs=5000000) out=LIB.C7_1;
by B;
run;
endrsubmit;
signon task1_2 sascmd="!sascmd";
rsubmit task1_2 wait=no;
libname LIB '/batch/sas/SASTMP';
proc sort data=LIB.C (firstobs=5000001 obs=10000000) out=LIB.C7_2;
by B;
run;
endrsubmit;
signon task1_3 sascmd="!sascmd";
rsubmit task1_3 wait=no;
libname LIB '/batch/sas/SASTMP';
proc sort data=LIB.C (firstobs=10000001 obs=15000000) out=LIB.C7_3;
by B;
run;
endrsubmit;
signon task1_4 sascmd="!sascmd";
rsubmit task1_4 wait=no;
libname LIB '/batch/sas/SASTMP';
proc sort data=LIB.C (firstobs=15000001 obs=20000000) out=LIB.C7_4;
by B;
run;
endrsubmit;
*rdisplay task1;
waitfor _all_ task1_1 task1_2 task1_3 task1_4;
signon task2 sascmd="!sascmd";
rsubmit task2 wait=no;
libname LIB '/batch/sas/SASTMP';
data LIB.D7;
retain _A _C1 _C2;
drop _:;
set LIB.C7_1 LIB.C7_2 LIB.C7_3 LIB.C7_4;
by B;
if _A lt A then do;
_A=A; _C1=C1; _C2=C2;
end;
if last.B;
A=_A; C1=_C1; C2=_C2;
run;
endrsubmit;
waitfor task2;
rget task1_1;
rget task1_2;
rget task1_3;
rget task1_4;
rget task2;

Cheers from Portugal.

Daniel Santos @ www.cgd.pt.

ChrisNZ · Posted 07-09-2009 05:34 PM

Mmm, interesting. I woud never have thought of splitting the sort.
This is probably the wrong place to prolong this discussion, though.
As usual when discussing performance, the answer is: it depends on the data and what you do with it, which makes sizing the right box for a data warehouse so difficult.

High-Performance SAS Coding - Third Edition

Bhoopesh · Posted 11-10-2012 05:34 PM

Hi Daniel,

I have tested the above technique:

Finally, MP (4 Parts) Proc Sort (by one var) and datastep without piping parallelism:

Only first dataset is getting sorted which contains obs from 1 to 50000000 C7_1.

Other datasets are not getting sorted by MP4. Can you expalin it why..?

Even in the log it says that 50000000 observations are sorted but when i combine C7_1 to C7_4. Only C7_1 is sorted, is there anything i need to modify in the code.

Thanks in Advance...