Help using Base SAS procedures

Proc Sort Data set size limit w/ Windows 7

N/A
Posts: 0

We recently deployed a new set of SAS workstations and are now running into a limit on how large a data set we can sort. The limit depends on data set size, not row count, and appears to be between 18 and 20 GB. We had no problem sorting these same files in our older 32-bit XP environment.

The new systems are 64-bit server-class machines with Windows 7 Professional installed, 4 CPUs, 32 GB RAM, and 150 GB of dedicated SAS work space on a RAID array. We are running SAS 9.2 (TS2M2).

When we sort files below the size at which the failures occur, the sort is blazing fast. We can break the files up and ultimately get them sorted, but I would prefer to fix the root cause of the issue.
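For reference, the break-it-up workaround looks roughly like the sketch below. It is only an illustration: sort_key is a hypothetical stand-in for the real BY variable(s), the split point of 100,000,000 rows is arbitrary, and work.part1, work.part2, and lpserv.test1_sorted are made-up names.

```sas
/* Sketch only: sort_key, the split point, and the output names are assumptions. */

/* Sort the first part of the file on its own. */
proc sort data=lpserv.test1(obs=100000000) out=work.part1;
    by sort_key;
run;

/* Sort the remainder of the file. */
proc sort data=lpserv.test1(firstobs=100000001) out=work.part2;
    by sort_key;
run;

/* Interleave the two sorted parts: SET with a BY statement
   reads both inputs in BY order, so the result is sorted. */
data lpserv.test1_sorted;
    set work.part1 work.part2;
    by sort_key;
run;
```

Each part stays under the size where the failure appears, and the final DATA step only reads the parts sequentially rather than sorting again.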

A sample of the errors from the log:
ERROR: Failure while attempting to write page 82 of sorted run 637.
ERROR: Failure while attempting to write page 526690 to utility file 1.
ERROR: Failure encountered while creating initial set of sorted runs.
ERROR: Failure encountered during external sort.
ERROR: Sort execution failure.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: There were 205470721 observations read from the data set LPSERV.TEST1.
WARNING: The data set LPSERV.TEST1 may be incomplete. When this step was stopped there were 0
observations and 21 variables.
WARNING: Data set LPSERV.TEST1 was not replaced because this step was stopped.
NOTE: PROCEDURE SORT used (Total process time):
real time 26:17.38
cpu time 14:44.46
Trusted Advisor
Posts: 2,113

Re: Proc Sort Data set size limit w/ Windows 7

First, are you running a supported version of SAS? Just saying TS2M2 isn't enough.
http://support.sas.com/kb/34/569.html
If so, this is probably worth a call to tech support.
Occasional Contributor
Posts: 6

Re: Proc Sort Data set size limit w/ Windows 7

PROC SORT with NODUPKEY will always keep the physically first record for each key; that is, with the data as you list it, c=71 will always be kept. PROC SQL will not necessarily return any particular record. You could ask for the min or max, but you could not guarantee the first record in sort order regardless of how you wrote the query, because SQL will often re-sort the data as needed to accomplish the query as efficiently as possible.

They will be identical insofar as they both return the same number of records, if that is your concern.

You cannot accomplish exactly the same thing in a straightforward manner in SQL. Because SQL has no concept of row ordering, you would either need a rule for choosing which c to keep (max(c), min(c), etc.) or have to add a row counter and choose the row with the lowest value of that.

For example:

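data work.dataset;
    input a b c;
    rowcounter = _n_;  /* remember the original row order */
    datalines;
27 93 71
27 93 72
46 68 75
55 55 33
46 68 68
34 34 32
45 67 88
56 75 22
34 34 32
;
run;

proc sql;
    /* min(rowcounter*100 + c) comes from the first row of each group;
       subtracting min(rowcounter*100) leaves that row's c */
    select a, b, min(rowcounter*100 + c) - min(rowcounter*100) as c
    from work.dataset
    group by a, b;
quit;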

That's using a cheat: it relies on rowcounter*100 always dominating c, so that the minimum is decided by the row counter alone. Of course, if your c doesn't have values appropriate for that (in this example, they need to span a range of less than 100), this won't work and you're better off merging it on separately.
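A rough sketch of that separate merge, in case it helps: find the lowest rowcounter per (a, b), then join back to pick up that row's c. It assumes work.dataset from the example above (including rowcounter); firstrow, work.first_by_sql, and work.first_by_sort are names made up for illustration. The PROC SORT and PROC COMPARE steps are only there to check the result against NODUPKEY.

```sas
/* Sketch only: assumes work.dataset (with rowcounter) from the example above. */
proc sql;
    create table work.first_by_sql as
    select d.a, d.b, d.c
    from work.dataset as d
         inner join
         /* lowest row number in each (a, b) group; firstrow is a made-up name */
         (select a, b, min(rowcounter) as firstrow
          from work.dataset
          group by a, b) as f
         on d.a = f.a and d.b = f.b and d.rowcounter = f.firstrow
    order by d.a, d.b;
quit;

/* The PROC SORT version, for comparison. */
proc sort data=work.dataset out=work.first_by_sort(drop=rowcounter) nodupkey;
    by a b;
run;

/* Report any differences between the two results. */
proc compare base=work.first_by_sort compare=work.first_by_sql;
run;
```

Unlike the rowcounter*100 trick, this does not depend on the range of values in c.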

If you are interested in a SQL solution, consider posting that explicitly as a separate question so the SQL-only folks will see and answer it.

Super User
Posts: 5,256

Re: Proc Sort Data set size limit w/ Windows 7

I think you are a candidate for a server-based architecture. It gives you more power, is easier to maintain, and encourages cooperation.
If that's not possible, start with an upgrade; that shouldn't be hard with stand-alone installations (and stand-alone data?).
Data never sleeps