Proc Sort - tagsort option

Occasional Contributor
Posts: 10

Hi,

I am trying to sort a very large dataset (approx. 32 GB) on 4 variables. I tried using the TAGSORT option but got this error message:

ERROR: TAGSORT option cannot be used with engines that do not support random access.

Any suggestions as to how this error can be avoided? Or how can I sort the dataset more efficiently?

Thanks,

Neha

Super User
Posts: 5,516

Re: Proc Sort - tagsort option

Why would you not have random access?  Is your data stored on tape (or on disk, but in tape format)?

If you can run this program, there is probably a solution:

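data _null_;
   _n_=50;               /* any observation number that exists in HAVE */
   set have point=_n_;   /* POINT= reads by observation number, so it needs random access */
   stop;                 /* required with POINT=, otherwise the step never ends */
run;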

SAS Press publishes a book on efficiency (author is Virgile).  The chapter on sorting contains a small section on how to program your own TAGSORT, but it requires point= as part of the solution.

A secondary question is why you need TAGSORT.  Maybe this should be the primary question.  Is the data set so large that you can't get the sort work space?  Are you hoping that TAGSORT will get the program to run faster?

Good luck.

Occasional Contributor
Posts: 10

Re: Proc Sort - tagsort option

Posted in reply to Astounding

Thanks for the reply, but the dataset is indeed really large (on the order of 30 GB), and the main objective behind using TAGSORT is to save work space.

Super User
Posts: 5,516

Re: Proc Sort - tagsort option

If work space is the limiting factor, it will help to have other disk space you can use.  Split up the data.  For illustration, assume you have 90M observations:

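/* OUT is a libref pointing at the other disk space;       */
/* "four variables" stands in for your actual BY variables */
proc sort data=have (firstobs=1 obs=10000000) out=out._1_;
   by four variables;
run;

/* OBS= is the number of the last observation to read, not a count */
proc sort data=have (firstobs=10000001 obs=20000000) out=out._2_;
   by four variables;
run;

...

proc sort data=have (firstobs=80000001) out=out._9_;
   by four variables;
run;

/* SET with BY interleaves the sorted pieces into one sorted data set */
data want;
   set out._1_ out._2_ ... out._9_;
   by four variables;
run;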
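
If typing out nine nearly identical steps is a nuisance, the same split-and-interleave idea can be generated with a macro loop. This is only a sketch: the macro name split_sort, its parameters, and the BY variables var1-var4 are hypothetical, and it assumes the libref OUT is already assigned and that (n_chunks - 1) * chunk is smaller than the number of observations in HAVE.

```sas
/* Hypothetical helper: sort HAVE in fixed-size chunks, then interleave. */
%macro split_sort(data=, by=, outlib=, chunk=, n_chunks=);
   %local i first last;

   %do i = 1 %to &n_chunks;
      %let first = %eval((&i - 1) * &chunk + 1);
      %let last  = %eval(&i * &chunk);

      /* The last chunk omits OBS= so it picks up all remaining observations */
      proc sort data=&data (firstobs=&first
            %if &i < &n_chunks %then obs=&last;
            ) out=&outlib.._&i._;
         by &by;
      run;
   %end;

   /* Interleave the sorted pieces: SET plus BY keeps the result in order */
   data want;
      set %do i = 1 %to &n_chunks; &outlib.._&i._ %end; ;
      by &by;
   run;
%mend split_sort;

/* Example call matching the 90M-observation illustration above */
%split_sort(data=have, by=var1 var2 var3 var4, outlib=out,
            chunk=10000000, n_chunks=9);
```

Each PROC SORT only needs sort work space for one chunk at a time, which is the point of the exercise.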

You may even find that this runs faster.  The SAS sorting mechanism has a small component that is proportional to the square of the number of observations.

If it is technically sound for your application (and if your operating system supports it), the NOEQUALS option may speed things up as well.
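
For reference, NOEQUALS goes on the PROC SORT statement. In this sketch the BY variables var1-var4 and the output name are placeholders. Keep in mind that NOEQUALS lets SAS return observations with identical BY values in any order, so only use it if nothing downstream depends on their original relative order.

```sas
/* NOEQUALS: ties on the BY variables need not keep their input order */
proc sort data=have out=out.have_sorted noequals;
   by var1 var2 var3 var4;   /* placeholder BY variables */
run;
```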

Good luck.

Trusted Advisor
Posts: 2,116

Re: Proc Sort - tagsort option

The POINT= functionality that Astounding references also requires random access.  TAGSORT will generally save space in performing the sort (it is not necessarily faster), but you will need to copy the data to a standard SAS engine first (I suspect you have a transport, i.e. XPORT engine, or other sequential-format dataset).
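
A minimal sketch of that copy, assuming the libref SRC already points at the sequential library, that the path shown (a made-up placeholder) has room for a full copy of the data, and that var1-var4 stand in for the real BY variables:

```sas
/* Hypothetical target library on ordinary disk, using the standard V9 engine */
libname disk v9 '/path/with/enough/space';

/* Copy the one data set out of the sequential library */
proc copy in=src out=disk memtype=data;
   select have;
run;

/* The copy supports random access, so TAGSORT is now allowed */
proc sort data=disk.have out=disk.have_sorted tagsort;
   by var1 var2 var3 var4;   /* placeholder BY variables */
run;
```

The copy itself takes roughly as much disk as the original data set, so this only helps if the shortage is in sort work space rather than in permanent storage.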

Doc Muhlbaier

Duke

Occasional Contributor
Posts: 10

Re: Proc Sort - tagsort option

Hey Duke,

I checked the engine for my dataset and it's V9TAPE. How can I copy it to the standard V9 format?

Thanks.
