tagsort option for proc sql

Reply
Regular Contributor
Posts: 161

tagsort option for proc sql

I have a large dataset that requires sorting while using proc sql.  SAS run out of disk space always.  If I use proc sort tagsort, things are much easier.  But is there such an option in the proc sql to force sas use tagsort?  Also, can I sort before using proc sql so that proc sql won't have to do sorting again?  Thanks!

Super User
Super User
Posts: 7,039

Re: tagsort option for proc sql

Posted in reply to caveman529

Could you create an index on the BIG dataset?

Regular Contributor
Posts: 161

Re: tagsort option for proc sql

Hi, Tom.  Thank you for your reply!

I summarize the data using proc sql by different "level" of data depending on situation.  For example, I have the following variables in my dataset

key1 key2 key3 key4 var1 var2 var3 var4

I sometimes have to group and sort by key1 and key2.  Sometimes, I have to group and sort by key3 and key4.  Sometimes, just by key1. 

Does creating index with key1 to key4 help in this situation.  Is proc sql intelligent enough to know that there is an index it can use?  Thank you!

I also heard that proc summary doesn't require sorting to do the job.  Am I correct on this?  Thanks!

Super User
Super User
Posts: 7,039

Re: tagsort option for proc sql

Posted in reply to caveman529

PROC SUMMARY does not require that CLASS variables be sorted.  But it will require enough memory to store all of the bins.

proc summary data=have nway ;

  class key1 key2 ;

  var var1 var2 ;

  output out=want sum= ;

run;

will create the sums of VAR1 and VAR2 for all combinations of KEY1 crossed with KEY2 .

You can also create many other combinations of the class variables.

Regular Contributor
Posts: 161

Re: tagsort option for proc sql

I plan to summarize the data at different classes.  Is creating an index for all the key variables useful?  For example, I have:

key1 key2 key3    var1 var2 var3

I use proc sql to summarize by key1 sometimes and by key1 and key3 sometime.  Can I create a index for key1 - key3?  Is the proc sql intelligent enough to know that i have created index on key1 - key3 already and don't sort the data?

Alternatively, can I create index for each situation?  For example, one index for key1, one index for key1 and key3, ... etc.  When I summarize the data using proc sql under different class (by var1, var3 vs. by var1), is the proc sql clever enough to know which index to use?  Thank you!

Super User
Super User
Posts: 7,039

Re: tagsort option for proc sql

Posted in reply to caveman529

Index might not help if you are summarizing all of the data.  Would help if you are applying WHERE clause.

Also index might help if you have a lot of levels as you could then use PROC SUMMARY with a BY statement without running out of memory.

proc summary data=have nway ;

by key1 ;

class key2 ;

var var1 var2 ;

output out=want sum=;

run;

Regular Contributor
Posts: 161

Re: tagsort option for proc sql

I'll give that method a try.

The strange things is that the temporary/scratch drive has over 3 TB in space.  But when I see the log, I saw the following:

ERROR: Error occurred while reading from temporary sort file.

This seem to be different from insufficient space error.  Before this failure, the program was able to successfully complete a much larger sorting and summarizing.  I did a quick search on the Google and find nothing.  Thanks!

Super User
Super User
Posts: 7,039

Re: tagsort option for proc sql

Posted in reply to caveman529

Report it to SAS support.  Check the file structure used by the disk, perhaps it has problems with large files?

Ask a Question
Discussion stats
  • 7 replies
  • 839 views
  • 6 likes
  • 2 in conversation