Hi,
My program ended abruptly with an I/O error while performing a join in a PROC SQL step. The input dataset holds close to a million records. The program runs on a remote server, which also hosts the SAS software. The programmer who ran this program for close to 3 years says she never got this error. I checked the SAS Work directory and we have enough space.
I re-ran the program, but the same error appeared. Then I reduced the number of observations using the OBS= option, and this time it ran just fine, but the full run still failed. After we rebooted the server, the code ran fine even with all observations. Is the error really related to the server? If so, why would a smaller dataset not trigger it?
I appreciate your help.
Thanks
Gnans
Given the diagnostics, the most likely reason is that you are running out of resources.
How do you know that your saswork is big enough?
A rule of thumb is to have the saswork at least 3-4 times bigger than the largest table in a join.
If RAM is available, raising MEMSIZE and SORTSIZE might help.
You could monitor the server while your query is running, looking at CPU, memory and disk usage.
You might also want to optimize the query, e.g. by adding indexes. Using the PROC SQL _METHOD option will help you understand how SAS is executing the query.
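A minimal sketch of the checks mentioned above, in case it helps. The names mylib.big, mylib.lookup, the key id and the column descr are placeholders I made up, not the poster's actual tables:

```sas
/* Hypothetical names: mylib.big, mylib.lookup, id and descr are placeholders. */

/* Show where WORK points and the current memory-related settings. */
proc options option=work; run;
proc options option=memsize; run;
proc options option=sortsize; run;

/* Create a simple index on the join key of the lookup table. */
/* A simple index must have the same name as its column.      */
proc sql;
  create index id on mylib.lookup(id);
quit;

/* _METHOD writes the chosen execution plan to the log, e.g.  */
/* sqxjm (sort-merge join), sqxjhsh (hash join) or sqxjndx    */
/* (index join).                                              */
proc sql _method;
  create table work.joined as
  select b.*, l.descr
  from mylib.big as b
       inner join mylib.lookup as l
       on b.id = l.id;
quit;
```

If the log shows a sort-merge join (sqxjm), both inputs are sorted into utility files first, which is where the extra WORK space goes.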
Thanks for your reply. Yes, I checked the SAS Work location, which has been given 600 GB of space, while the table is 90 GB. Since the code ran without an I/O error for years, a few people suggested that the server be monitored. But the admin wants a list of the components to be monitored. Any idea which components? I tried SORTSIZE, but to no avail. The code has been split into simple steps and is currently being tested as DATA steps, so I haven't tested the _METHOD option yet.
Gnans
Disk and memory usage should be the most interesting to monitor.
If you run with less data, add OPTIONS FULLSTIMER MSGLEVEL=I; to your program - you will have some more detailed information on resource consumption in the log.
What do you mean by "but to no avail"? If your job is alone on the server, try setting MEMSIZE to 75% of actual memory, and SORTSIZE to 80-90% of MEMSIZE.
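As a rough sketch of those settings: the numbers below assume a server with 16 GB of RAM, which is only an example figure and not something stated in this thread.

```sas
/* Assumption: the server has 16 GB of RAM (example figure only). */

/* MEMSIZE can only be set when SAS starts, not in an OPTIONS     */
/* statement. On the command line or in the config file:          */
/*   -MEMSIZE 12G      (about 75% of 16 GB)                       */

/* SORTSIZE can be changed inside the session.                    */
options sortsize=10G;            /* roughly 80-90% of MEMSIZE */

/* More detailed resource statistics and extra notes in the log. */
options fullstimer msglevel=i;
```

With FULLSTIMER on, each step in the log reports memory use along with real and CPU time, which gives the admin concrete numbers to compare against what the server shows.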
Gnans,
It may not be a resource issue. It may be that one of the data sets has been corrupted, toward the end of the file. You wouldn't have any problem reading the beginning of the data set (with OBS=), but would run into trouble when hitting the corrupted spot.
Try a simple test. Just read in each source separately without even saving anything. Just verify that you can actually read each data source.
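A sketch of such a read test, assuming a placeholder table name mylib.big (repeat for each source table):

```sas
/* Hypothetical table name. DATA _NULL_ reads every observation */
/* but writes no output data set.                               */
data _null_;
  set mylib.big;
run;
```

If this step fails with an I/O error while the same step with OBS= set to a smaller number succeeds, that points toward a damaged region in the file rather than a lack of WORK space.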
Good luck.
Besides everything else that has already been said:
90 GB for 1M rows means that you use around 94 KB per row. Is this possible? Or do you have deleted observations in your dataset, which may be growing constantly until you run out of space? PROC CONTENTS will tell you.
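For example, with a placeholder table name:

```sas
/* Hypothetical table name. Check "Observations",        */
/* "Deleted Observations" and "Observation Length" in    */
/* the output.                                           */
proc contents data=mylib.big;
run;
```

Observations multiplied by Observation Length should be in the same range as the file size on disk. A large count of deleted observations would explain a file that is much bigger than that.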
Sorry, it's my mistake. The table contains 234,773,795 rows.
Hi,
The NOTHREADS option can help.
proc sql nothreads;
:
:
quit;
proc sort data=<datasetname> nothreads out=<outdatasetname>;
by <key>;
run;
geraldo brasil.
I would say, I/O error can be very vague.
It would be of great help if you could post the error and give us some details about the table you are trying to process and the type of data manipulation.
Could be a corrupted data set, as Astounding said.
Could be the WORK library running out of space. Depending on what you're doing with the data, you may need available space of 3 to 4 times the size of the input data set (a SAS sort on Windows, for example). And don't forget that the WORK library is normally shared between all SAS processes running on the machine.
Cheers from Portugal.
Daniel Santos @ www.cgd.pt
Update:
The code has now run successfully without the I/O error, after the server was rebooted. The admin suggested that another memory-intensive application sharing the server might have used up the resources.