I have a program that has been working for a couple of years but now when I run it on my local PC I get Sort Initialization failures which blows up the entire program.
The program is assigning a group of individual subjects to up to 5 local service centers based upon distance.
To save memory, I took the service center ids and distance from the subject address and assigned them previously. Now I am adding back in the details for each service center address via proc sql so that table is left joined to the (member-ServiceCenter) table 5 times to add the detail address data for each service center. When I do this on our SAS server, I don't have any issues but when I run this process locally on my laptop/PC the SAS session issues a SORT initialization error that blows up.
I read the article :
Usage Note 39705: Troubleshooting "Sort initialization failure" and "Sort execution failure" errors in Windows and UNIX environments (https://support.sas.com/kb/39/705.html )
but I am not sure I understand what I am supposed to do with the memory and sortsize parameters. I have my memory over 16Gb and my Sortsize at 8Gb. but the error keeps occurring. I have occasionally just kept running it over and over and eventually the program works.
The original service center table was nationwide so I filtered to just include the service center ids that are used by my current sample of individuals. So the original service center table was 77,000 locations but now if I am running the report for 100 individuals, then at most I will need only 500 service center ids in my table, and actually it is closer to 200 as there are many individuals within the same geographic area so that they can be assigned to the same service center. So now even though there are 5 copies of the table, they are each small. But I still get the sort initialization error. I have added the threaded sort option and that worked sometimes but lately that is not working either.
I guess I am looking for some configuration ideas. Or some explanation of why this might work on the server but not on the local PC. My memory configurations actually show that I am allocating even more memory on my local than the server allocates.
The gist of the query is :
Proc sql ;
create table customer_service_Details as
select a.name, a.address, a.city, a.state, a.zipcode, a.service_center_id1, a.service_center_id2, a.service_center_id3, a.service_center_id4, a.service_center_id5,
b.sc_name as sc_name1, b.sc_address as sc_address1 ......
c.sc_name as sc_name2, c.sc_address as sc_address2,.....
d......
e.......
f.sc_name as sc_name5, f.sc_address as sc_address5 ....
from
customer a
left outer join service_center_details b on a.service_center_id1 = b.service_center_id
left outer join service_center_details c on a.service_center_id2 = c.service_center_id
left outer join service_center_details d on a.service_center_id3 = d.service_center_id
left outer join service_center_details e on a.service_center_id4 = e.service_center_id
left outer join service_center_details f on a.service_center_id5 = f.service_center_id
;
quit;
This is simulated code and there are many additional fields added for each address.
Any guidance or help would be appreciated. I have worked with other members on my team as well and we are all stumped.
Carl
Have Windows Task Manager open while running the program. Does memory max out or not? Also check free disk space. Not enough free disk space can cause sort execution failures. You need at around 3 times the amount of space a dataset takes up to sort it.
Is your SAS installation 32 or 64 bit? 32-bit SAS only uses up to 2GB of memory.
So this is basically a multiple lookup into the same details table.
A SQL query like this in SAS is always a disaster waiting to happen (see Maxim 10).
Replace it with a data step and a hash object, you will see a performance improvement by several orders of magnitude.
As soon as I'm on my desktop, I'll cobble up some example code.
OK, here we go:
data details;
length
service_center_id $10
sc_name $30
sc_address $200
;
service_center_id = "1234567890";
sc_name = "Anton";
sc_address = "somewhere";
run;
data customer;
service_center_id1 = "1234567890";
service_center_id2 = "1234567890";
service_center_id3 = "1234567890";
service_center_id4 = "";
service_center_id5 = "";
run;
data customer_service_details;
set customer;
/* define all new and hash variables here */
length
service_center_id $10
sc_name
sc_name1
sc_name2
sc_name3
sc_name4
sc_name5
$30
sc_address
sc_address1
sc_address2
sc_address3
sc_address4
sc_address5
$200
;
/* make arrays for use in the DO loop */
array sc_k {*} service_center_id1-service_center_id5;
array sc_n {*} sc_name1-sc_name5;
array sc_a {*} sc_address1-sc_address5;
if _n_ = 1
then do; /* define hash */
declare hash dt (dataset:"details");
dt.definekey("service_center_id");
dt.definedata("sc_name","sc_address");
dt.definedone();
call missing(service_center_id,sc_name,sc_address); /* to avoid "uninitialized" NOTE */
end;
/* look up in a loop */
do i = 1 to dim(sc_k);
if dt.find(key:sc_k{i}) = 0
then do;
sc_n{i} = sc_name;
sc_a{i} = sc_address;
end;
end;
drop service_center_id sc_name sc_address i;
run;
The SQL requires a SORT (or at least creation of an internal index) for all 5 keys of the whole incoming dataset, while the data step works through the main dataset sequentially, and builds a b-tree for the keys of the lookup table only once. All "sorting" is done purely in memory, and only for the (considerably smaller) lookup table.
Even before the advent of the hash, I would not have used a single SQL for this.
Even running 5 SORTs and 5 data step MERGEs would usually outperform the SQL (unless you could hand over the SQL to a DBMS, if your data is actually stored there).
Next we turned to defining formats for each column of the lookup table, and use those, speeding up the process massively.
And then came hash, which outperforms formats once again. It requires a little bit more coding (but you avoid creating a CNTLIN dataset for each lookup column), but the performance benefits - well, you will see those.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.