Solved: Re: how to solve the error: sort execution failure

freda · Posted 08-05-2018 01:06 PM

Hi, dear all,

After several times, I always have the following error message:

ERROR: Sort execution failure.

Here's my code:

proc options group=memory;
run;
proc options option=(work utilloc) value;
run;
proc options option=threads;
run;
options fullstimer msglevel=i;

proc import out=work.lal1
datafile='C:\thesis1\ideas\agent_quality\sas.dta'
dbms=DTA replace;
run;

data have;
set work.lal1;
run;

proc sql;
create table table1 as
select unique have.*, count(past1.date_sign) as la2_no_la2
from have
left join
have past1
on have.listing_2=past1.listing_2
and 0 < have.date_sign-past1.date_sign
group by 1
order by 1 ;

It works before, but since yesterday, I have the above error.

I google the reasons and try some methods suggested by others, but the error msg still there.

Can anyone help me to solve it?

Thank you

Freda

hashman · Posted 08-06-2018 12:02 AM

@freda:

Your SQL has to sort HAVE twice to make the query happen, which is evident from the notes the SQL statement _METHOD option prints in the log:

sqxcrta
    sqxunqs
        sqxsumg
            sqxjm
                sqxsort
                    sqxsrc( WORK.HAVE(alias = PAST1) )
                sqxsort
                    sqxsrc( WORK.HAVE )

You can help SQL by sorting HAVE just once beforehand. Instead of the pointless DATA step that merely creates a copy of LAL1, sort LAL1 into HAVE:

proc sort data = lal1 out = have ;
  by listing_2 date_sign ;
run ;

Proc SORT sets SORTEDBY=LISTING_2 DATE_SIGN data set option on in the descriptor of HAVE, from which SQL will know that the input is already sorted. In this case, when you run the query, you will see the execution methods as:

sqxcrta
    sqxunqs
        sqxsumg
            sqxjm
                sqxsrc( WORK.HAVE(alias = PAST1) )
                sqxsrc( WORK.HAVE )

As you see, there's no SQXSORT execution method listed, i.e. SQL takes the existing sorted order into account and refrains from sorting itself, which will obviously make the SQL sort error go away.

It would be even better if your LAL1 is already intrinsically sorted - meaning that its data are physically in order but the SORTEDBY=LISTING_2 DATE_SIGN is not set. In this case, you don't need to sort anything, just tell SQL that the data are already sorted by specifying SORTEDBY=:

option msglevel=I ;
proc sql _method ;
  create table table1 as
  select unique have.*, count(past1.date_sign) as la2_no_la2
  from have (sortedby=listing_2 date_sign)
  left join
  have (sortedby=listing_2 date_sign) past1
  on have.listing_2=past1.listing_2
  and have.date_sign > past1.date_sign
  group by 1
  order by 1 
  ;
quit ;

In this case, the SQL execution will skip the sorting as well:

sqxcrta
    sqxunqs
        sqxsumg
            sqxjm
                sqxsrc( WORK.HAVE(alias = PAST1) )
                sqxsrc( WORK.HAVE )

If the proc SORT step should fail, it would mean you have a problem with the file size and/or available SORTWORK space and need a conversation with the sysadmin.

If that should fail, you can use the dinosaur method of breaking LAL1 into a number of more or less equal chunks, sorting each, and interleaving the results. For example, if the file has 30 million observations and you break it into 3 chunks, it could look like this:

proc sort data=lal1 (firstobs=00000001 obs=10000000) out=c1 ; by listing_2 date_sign ; run ;
proc sort data=lal1 (firstobs=10000001 obs=20000000) out=c2 ; by listing_2 date_sign ; run ;
proc sort data=lal1 (firstobs=20000001 obs=30000000) out=c3 ; by listing_2 date_sign ; run ;
                                                                                            
data have (sortedby=listing_2 date_sign) ;                                                  
  set c1 c2 c3 ;                                                                            
  by listing_2 date_sign ;                                                                  
run ;

Then just run the SQL query against HAVE. Of course, if need be, you can break LAL1 into a greater number of finer chunks. But in the end, if you have enough WORK space to store the chunks (which could be subsequently deleted) and HAVE, you'll get there by hook or by crook. This technique essentially trades the need for SORTWORK space about three times the size of LAL1 for extra WORK space about the same size as LAL1.

Paul D.

View solution in original post

error_prone · Posted 08-05-2018 01:38 PM

Please post the log of the run and list the things you already tried to avoid useless suggestions.

freda · Posted 08-05-2018 01:42 PM

1 proc options group=memory;
2 run;

SAS (r) Proprietary Software Release 9.4 TS1M4

Group=MEMORY
SORTSIZE=1073741824
Specifies the amount of memory that is available to the SORT procedure.
SUMSIZE=0 Specifies a limit on the amount of memory that is available for data summarization procedures when
class variables are active.
MAXMEMQUERY=0 Specifies the maximum amount of memory that is allocated for procedures.
MEMBLKSZ=16777216 Specifies the memory block size for Windows memory-based libraries.
MEMMAXSZ=2147483648
Specifies the maximum amount of memory to allocate for using memory-based libraries.
LOADMEMSIZE=0 Specifies a suggested amount of memory that is needed for executable programs loaded by SAS.
MEMSIZE=2147483648
Specifies the limit on the amount of virtual memory that can be used during a SAS session.
REALMEMSIZE=0 Specifies the amount of real memory SAS can expect to allocate.
NOTE: PROCEDURE OPTIONS used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds

3 proc options option=(work utilloc) value;
4 run;

SAS (r) Proprietary Software Release 9.4 TS1M4

Option Value Information For SAS Option WORK
Value: C:\Users\Samuel\AppData\Local\Temp\SAS Temporary Files\_TD10256_SAMUEL-PC_
Scope: SAS Session
How option value set: Config File
Config file name:
C:\Program Files\SASHome\SASFoundation\9.4\nls\u8\sasv9.cfg

Option Value Information For SAS Option UTILLOC
Value: WORK
Scope: Default
How option value set: Shipped Default

NOTE: PROCEDURE OPTIONS used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds

5 proc options option=threads;
6 run;

SAS (r) Proprietary Software Release 9.4 TS1M4

THREADS Uses threaded processing for SAS applications that support it.
NOTE: PROCEDURE OPTIONS used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds

7 options fullstimer msglevel=i;
8
9 proc import out=work.lal1
10 datafile='C:\thesis1\ideas\agent_quality\sas.dta'
11 dbms=DTA replace;
12 run;

NOTE: The import data set has 1553099 observations and 15 variables.
NOTE: WORK.LAL1 data set was successfully created.
NOTE: PROCEDURE IMPORT used (Total process time):
real time 2.13 seconds
user cpu time 0.29 seconds
system cpu time 0.20 seconds
memory 982.87k
OS Memory 9960.00k
Timestamp 2018-08-05 12:57:29 PM
Step Count 4 Switch Count 0

13
14 data have;
15 set work.lal1;
16 run;

NOTE: There were 1553099 observations read from the data set WORK.LAL1.
NOTE: The data set WORK.HAVE has 1553099 observations and 15 variables.
NOTE: DATA statement used (Total process time):
real time 0.28 seconds
user cpu time 0.07 seconds
system cpu time 0.20 seconds
memory 882.31k
OS Memory 10476.00k
Timestamp 2018-08-05 12:57:30 PM
Step Count 5 Switch Count 0

17
18
19 proc sql;
20 create table table1 as
21 select unique have.*, count(past1.date_sign) as la2_no_la2
22 from have
23 left join
24 have past1
25 on have.listing_2=past1.listing_2
26 and 0 < have.date_sign-past1.date_sign
27 group by 1
28 order by 1 ;
NOTE: The query requires remerging summary statistics back with the original data.
NOTE: SAS threaded sort was used.
ERROR: Sort execution failure.

Kurt_Bremser · Posted 08-05-2018 03:56 PM

This step

data have;
set work.lal1;
run;

is totally useless. It wastes three lines of code and the whole disk space of the dataset.

Finally, you are most probably running out of disk space when SQL has to sort your dataset for the join or for the unique.

Omit the above step and use lal1 as the source in your SQL.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

freda · Posted 08-06-2018 02:58 PM

Thank you so much and sorry for the late response

hashman · Posted 08-06-2018 12:02 AM

@freda:

Your SQL has to sort HAVE twice to make the query happen, which is evident from the notes the SQL statement _METHOD option prints in the log:

sqxcrta
    sqxunqs
        sqxsumg
            sqxjm
                sqxsort
                    sqxsrc( WORK.HAVE(alias = PAST1) )
                sqxsort
                    sqxsrc( WORK.HAVE )

You can help SQL by sorting HAVE just once beforehand. Instead of the pointless DATA step that merely creates a copy of LAL1, sort LAL1 into HAVE:

proc sort data = lal1 out = have ;
  by listing_2 date_sign ;
run ;

Proc SORT sets SORTEDBY=LISTING_2 DATE_SIGN data set option on in the descriptor of HAVE, from which SQL will know that the input is already sorted. In this case, when you run the query, you will see the execution methods as:

sqxcrta
    sqxunqs
        sqxsumg
            sqxjm
                sqxsrc( WORK.HAVE(alias = PAST1) )
                sqxsrc( WORK.HAVE )

As you see, there's no SQXSORT execution method listed, i.e. SQL takes the existing sorted order into account and refrains from sorting itself, which will obviously make the SQL sort error go away.

It would be even better if your LAL1 is already intrinsically sorted - meaning that its data are physically in order but the SORTEDBY=LISTING_2 DATE_SIGN is not set. In this case, you don't need to sort anything, just tell SQL that the data are already sorted by specifying SORTEDBY=:

option msglevel=I ;
proc sql _method ;
  create table table1 as
  select unique have.*, count(past1.date_sign) as la2_no_la2
  from have (sortedby=listing_2 date_sign)
  left join
  have (sortedby=listing_2 date_sign) past1
  on have.listing_2=past1.listing_2
  and have.date_sign > past1.date_sign
  group by 1
  order by 1 
  ;
quit ;

In this case, the SQL execution will skip the sorting as well:

sqxcrta
    sqxunqs
        sqxsumg
            sqxjm
                sqxsrc( WORK.HAVE(alias = PAST1) )
                sqxsrc( WORK.HAVE )

If the proc SORT step should fail, it would mean you have a problem with the file size and/or available SORTWORK space and need a conversation with the sysadmin.

If that should fail, you can use the dinosaur method of breaking LAL1 into a number of more or less equal chunks, sorting each, and interleaving the results. For example, if the file has 30 million observations and you break it into 3 chunks, it could look like this:

proc sort data=lal1 (firstobs=00000001 obs=10000000) out=c1 ; by listing_2 date_sign ; run ;
proc sort data=lal1 (firstobs=10000001 obs=20000000) out=c2 ; by listing_2 date_sign ; run ;
proc sort data=lal1 (firstobs=20000001 obs=30000000) out=c3 ; by listing_2 date_sign ; run ;
                                                                                            
data have (sortedby=listing_2 date_sign) ;                                                  
  set c1 c2 c3 ;                                                                            
  by listing_2 date_sign ;                                                                  
run ;

Then just run the SQL query against HAVE. Of course, if need be, you can break LAL1 into a greater number of finer chunks. But in the end, if you have enough WORK space to store the chunks (which could be subsequently deleted) and HAVE, you'll get there by hook or by crook. This technique essentially trades the need for SORTWORK space about three times the size of LAL1 for extra WORK space about the same size as LAL1.

Paul D.

freda · Posted 08-06-2018 02:58 PM

Thank you and sorry for the late reply

SAS Innovate 2025: Call for Content

Classroom Training Available!