Hi folks,
We have a new SAS server on Linux running SAS 9.4M7. We're in the testing phase and we're facing some issues when we process big tables (more than 20 million records and 100 columns).
When running PROC SORT or a SQL JOIN on this table, we get the following errors:
ERROR: Expecting page 91595, got page-1 instead.
ERROR: Page validation error while reading WORK.TABLE
ERROR: File WORK.TABLE.DATA is damaged. I/O processing did not complete.
We have almost 11 TB of free disk space in WORK. We ran IOTest and the results were fine; they were validated by a SAS technical consultant.
If, for example, we use a KEEP to read fewer columns (44 was the limit), the process runs fine. The weird thing is that we have another environment, also on Linux but with SAS 9.4M5, and the process executes fine there. We verified the ulimit values, and they are the same in both environments. The MEMSIZE and SORTSIZE options are larger in the new environment.
Any ideas?
Regards,
Hi folks,
Finally, we unmounted, reformatted, and remounted the filesystem, and the problem was resolved. Thanks all for your support.
Regards,
Looks like you are running out of SAS user WORK space. Check your user space quotas. Rerun the job and check space usage while it is running.
Thanks @SASKiwi . How can I check the user space quotas? We didn't define any specific user quotas. In fact, the process stops when it reaches almost 600 GB of WORK space used.
@MariaD - Your evidence suggests that some user-related storage limit rule is being applied. Check these anyway. I don't know what flavour of Linux you are using, but here is a link to the RHEL 7 doc: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/storage_administration...
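In case it helps, a few generic Linux commands for checking these limits; a quick sketch, with /tmp standing in for whatever filesystem actually holds your SAS WORK path:

```shell
# Per-process limits for the SAS user (max file size, open files, etc.)
ulimit -a

# Free space on the filesystem holding WORK
# (substitute the real WORK path for /tmp)
df -h /tmp

# User quotas, if quota tools are installed and quotas are enabled
quota -s 2>/dev/null || echo "no user quotas reported"
```

Run these as the same OS user that runs the SAS session, since limits and quotas are per-user.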
Thanks @SASKiwi . I'll share the notes with the server admin tomorrow morning and let you know. As far as I know, there are no per-user quotas, but we'll check again.
Regards,
I would look into any kind of workload management / quota system which may limit storage use by the user running the code.
PROC SORT takes up about 3 times the size of the data set being sorted: the original data, a temporary data set that is cleaned up when the sort finishes, and the final sorted set.
If the issue is ONLY with PROC SORT, the TAGSORT option can sometimes reduce the amount of temporary space used. Your trial with fewer variables does indicate a likely space limitation as part of the problem.
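For reference, TAGSORT is just an option on the PROC SORT statement; a minimal sketch, where WORK.BIG and KEYVAR are hypothetical names:

```sas
/* WORK.BIG and KEYVAR are placeholders for your real table and key */
PROC SORT DATA=WORK.BIG OUT=WORK.BIG_SORTED TAGSORT;
  BY KEYVAR;
RUN;
```

With TAGSORT, the utility files hold only the BY variables and observation numbers rather than whole records, so temporary space drops sharply, at the cost of a slower sort.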
Depending on the type of JOIN, you may want to provide specific code examples. Filtering data early in a query, instead of after the join, generally reduces the space used (Cartesian joins such as "FROM a, b" may be right out with data sets this big).
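To illustrate filtering early, here is a hedged sketch with made-up names (big, lookup, id, status, extra1): subset and narrow the second table inside the query, before the join touches it:

```sas
/* All table and column names here are hypothetical examples */
PROC SQL;
  CREATE TABLE WANT AS
  SELECT a.*,
         b.extra1
  FROM big a
  LEFT JOIN (SELECT id, extra1
             FROM lookup
             WHERE status = 'ACTIVE') b   /* filter before the join */
    ON a.id = b.id;
QUIT;
```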
Thanks @ballardw . Yes, we know a PROC SORT or a SQL JOIN without an index uses a lot of disk space. But the weird thing is that this particular process, created by an end user, runs fine on our 9.4M5 environment. It's a simple LEFT JOIN between 2 SAS tables. The first one is big; the second one is not. The join is on 2 columns.
If the dataset contains lots of wide character variables and is stored with the COMPRESS option, the utility files (which are uncompressed) can easily blow up your allotted disk storage.
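One quick way to check this: PROC CONTENTS reports whether a data set is compressed, and the log NOTE written when the data set was created shows the space saved. A sketch, using WORK.TABLE as a stand-in for the real data set:

```sas
/* WORK.TABLE is a placeholder for the actual data set */
PROC CONTENTS DATA=WORK.TABLE;
RUN;
/* Look for "Compressed  YES" in the attributes. The log from the step */
/* that created the table shows something like:                        */
/* NOTE: Compressing data set WORK.TABLE decreased size by xx percent. */
```

A high compression percentage means the uncompressed utility files can be several times larger than the data set itself.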
Thanks @Kurt_Bremser . In fact, we ran tests with and without the COMPRESS option. Because we are in the testing phase, we have a lot of WORK area available, almost 11 TB free. We monitored the free WORK space during execution and usage never went above 11%.
We already have a ticket open with SAS Support but, after more than a week, we have no update on it.
Thanks @gwootton. Both tables are SAS data sets created in previous steps. As a test, we created 2 new big tables (with totally random data) and the same error appears.
We also tested creating a big file directly on the filesystem (without SAS) to see if any error appears after it reaches a certain size, but the file was created without any error.
We also reviewed the block size for both environments and they have the same definition.
Hi @gwootton ,
I'm creating the test tables in the WORK area. In one of my tests, I created the tables first, then opened and browsed each of them, and everything was fine. After that, I closed the tables, ran the JOIN step, and the error appeared.
Below is the program used for the test (it's very simple). If I run it with 5 character columns and 130 numeric columns, it executes fine. But if I increase to 140 numeric columns, the error appears.
%LET COL   = 5;
%LET COL_N = 135;

DATA VALIDA;
  ARRAY VAR_(&COL_N);
  ARRAY STR_(&COL) $50.;
  DO j = 1 TO 30000000;
    DO i = 1 TO &COL_N;
      VAR_(i) = RANUNI(0);
    END;
    DO i = 1 TO &COL;
      DO n = 1 TO 40;
        SUBSTR(STR_(i), n) = BYTE(INT(65 + 26 * RANUNI(0)));
      END;
    END;
    OUTPUT;
  END;
RUN;

DATA VALIDA2;
  ARRAY VARI_(&COL_N);
  ARRAY STRI_(&COL) $50.;
  DO j = 1 TO 30000000;
    DO i = 1 TO &COL_N;
      VARI_(i) = RANUNI(0);
    END;
    DO i = 1 TO &COL;
      DO n = 1 TO 40;
        SUBSTR(STRI_(i), n) = BYTE(INT(65 + 26 * RANUNI(0)));
      END;
    END;
    OUTPUT;
  END;
RUN;

PROC SQL;
  CREATE TABLE VALIDA3 AS
  SELECT a.*,
         b.VARI_1,
         b.VARI_5
  FROM VALIDA a
  LEFT JOIN VALIDA2 b
    ON a.VAR_1 = b.VARI_2;
QUIT;
Regards.
So you do a SQL JOIN on two data sets with 30 million observations each?
With 140 numeric variables plus another 250 bytes in character variables, a 30-million-observation data set is about 41 gigabytes.
Multiply that by the 30 million observations from the second table and that is a million terabytes.
Do you have unique keys to the data or not?
Much better to just use a MERGE to look up values from a second data set.
data want;
  merge A (in=inA) B (keep=id extra1 extra2);
  by id;
  if inA;
run;
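One caveat to that sketch: MERGE with a BY statement needs both inputs sorted (or indexed) by the key, so a PROC SORT on each data set may be required first (A, B, and id are the same hypothetical names as in the merge step):

```sas
/* A, B, and id are placeholder names */
proc sort data=A; by id; run;
proc sort data=B; by id; run;
```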