BookmarkSubscribeRSS Feed
LaurieF
Barite | Level 11

I have about forty jobs getting data from somewhere online (where? I don't know; online's everywhere). Most of them will pick up only a handle of incremental files each day and put them on our Linux terminal server; they then get pushed to an S3 bucket (somewhere else!), and from there read into Snowflake. As of today, it works perfectly, except for one job.

 

We've ironed out almost all the procedural problems, except for one. When I run the jobs through DI (I've created a bespoke transformation which takes the name of the source table and automates the whole process through to Snowflake), they all run fine. But when they've been deployed and run under the service account, the big job (which reads roughly new 11.5k files a day) always crashes. Today's run was when it attempted file 3,574.

 

Because the log becomes quickly unmanageable and for security reasons, I mask it by using option nomprint, but I expose where it's up to and the error messages.

 

From the log:

File 3,570: 07MAY2025:22:16:11 /org/warehouse/bin/gateway/edh/org_table_name/_change_data/cdc-00068-5dafcc5b-7512-4ecb-9175-91d01fb39600.c000.snappy.parquet
File 3,571: 07MAY2025:22:16:11 /org/warehouse/bin/gateway/edh/org_table_name/part-00066-c7df53d0-0747-4e30-8c39-16c9ac9d075b.c000.snappy.parquet
File 3,572: 07MAY2025:22:16:11 /org/warehouse/bin/gateway/edh/org_table_name/part-00067-e7b5f225-f351-448a-b5f9-4624b613fdc0.c000.snappy.parquet
File 3,573: 07MAY2025:22:16:11 /org/warehouse/bin/gateway/edh/org_table_name/part-00068-f00a655d-10c3-4b17-93dd-2083d213618b.c000.snappy.parquet
ERROR: tkzCapture() failed
ERROR: tkzCapture() failed
ERROR: tkzCapture() failed
ERROR: Unable to establish an SSL connection.
ERROR: Extension Load Failure: OS Error: -1 (/sso/sfw/sas/940/SASFoundation/9.4/sasexe/t0b4en.so: cannot open shared object file: Too many open files)
ERROR: Extension Load Failure: OS Error: -1 (/sso/sfw/sas/940/SASFoundation/9.4/sasexe/t0b4en.so: cannot open shared object file: Too many open files)
ERROR: Message file "t0b4en" is not found.
ERROR: Extension Load Failure: OS Error: -1 (/sso/sfw/sas/940/SASFoundation/9.4/sasexe/t0b4en.so: cannot open shared object file: Too many open files)
ERROR: Extension Load Failure: OS Error: -1 (/sso/sfw/sas/940/SASFoundation/9.4/sasexe/t0b4en.so: cannot open shared object file: Too many open files)
ERROR: Message file "t0b4en" is not found.
ERROR: Message file is not loaded.
ERROR: Extension Load Failure: OS Error: -1 (/sso/sfw/sas/940/SASFoundation/9.4/sasexe/t0b4en.so: cannot open shared object file: Too many open files)
ERROR: Extension Load Failure: OS Error: -1 (/sso/sfw/sas/940/SASFoundation/9.4/sasexe/t0b4en.so: cannot open shared object file: Too many open files)
ERROR: Message file "t0b4en" is not found.
ERROR: Extension Load Failure: OS Error: -1 (/sso/sfw/sas/940/SASFoundation/9.4/sasexe/t0b4en.so: cannot open shared object file: Too many open files)
ERROR: Extension Load Failure: OS Error: -1 (/sso/sfw/sas/940/SASFoundation/9.4/sasexe/t0b4en.so: cannot open shared object file: Too many open files)
ERROR: Message file "t0b4en" is not found.
ERROR: Extension Load Failure: OS Error: -1 (/sso/sfw/sas/940/SASFoundation/9.4/sasexe/t0a2en.so: cannot open shared object file: Too many open files)
ERROR: Extension Load Failure: OS Error: -1 (/sso/sfw/sas/940/SASFoundation/9.4/sasexe/t0a2en.so: cannot open shared object file: Too many open files)
ERROR: Message file "t0a2en" is not found.
ERROR: Message file is not loaded.
ERROR: Extension Load Failure: OS Error: -1 (/sso/sfw/sas/940/SASFoundation/9.4/sasexe/t0a2en.so: cannot open shared object file: Too many open files)
ERROR: Extension Load Failure: OS Error: -1 (/sso/sfw/sas/940/SASFoundation/9.4/sasexe/t0a2en.so: cannot open shared object file: Too many open files)
ERROR: Message file "t0a2en" is not found.
ERROR: Extension Load Failure: OS Error: -1 (/sso/sfw/sas/940/SASFoundation/9.4/sasexe/t0b4en.so: cannot open shared object file: Too many open files)
ERROR: Extension Load Failure: OS Error: -1 (/sso/sfw/sas/940/SASFoundation/9.4/sasexe/t0b4en.so: cannot open shared object file: Too many open files)
ERROR: Message file "t0b4en" is not found.
ERROR: Extension Load Failure: OS Error: -1 (/sso/sfw/sas/940/SASFoundation/9.4/sasexe/t0b4en.so: cannot open shared object file: Too many open files)
ERROR: Extension Load Failure: OS Error: -1 (/sso/sfw/sas/940/SASFoundation/9.4/sasexe/t0b4en.so: cannot open shared object file: Too many open files)
ERROR: Message file "t0b4en" is not found.
ERROR: Message file is not loaded.
WARNING: Apparent symbolic reference SYS_PROCHTTP_STATUS_CODE not resolved.
WARNING: Apparent symbolic reference SYS_PROCHTTP_STATUS_CODE not resolved.
ERROR: A character operand was found in the %EVAL function or %IF condition where a numeric operand is required. The condition was: &sys_prochttp_status_code > 200 
ERROR: %EVAL function has no expression to evaluate, or %IF statement has no condition.

I think that sys_prochttp_status_code is destroyed at the top of each http call and created again very soon after, so I suspect that the error is being picked up at the procedure initialisation.

 

I've checked both my and the service account's Linux ulimit values - both 350,000, so the Too many open files would appear to be a red herring.

 

Here's the meat of the getfiles macro:

%do i = 1 %to &files;
    %let rc = %sysfunc(fetchobs(&dsid, &i));
    %let file = %sysfunc(strip(&file));
    %if %eval(%sysfunc(indexc(&file, %str(/)))) %then %do;
        %let sub_directory = %sysfunc(scan(&file, 1, %str(/)));
        %if %eval(%sysfunc(fileexist(&parent_directory/&source/&sub_directory)) = 0) %then /* Create each non-existant directory */
            %let rc = %sysfunc(dcreate(&sub_directory, &parent_directory/&source));
        %end;
    %if %eval(%sysfunc(fileexist(&parent_directory/&source/&file)) = 1) %then              /* Don't bother re-getting a file */
        %goto EndLoop;
    filename source "&parent_directory/&source/&file";
    %let url = https://&source_url/files/download?;
    %let url = &url.tableName=&source.%nrstr(&file=)&file;
    %let fail_count = 0;
/*
    Every (hour - 500 seconds), get another bearer code. It is only valid for an hour, so 500 seconds short
    will (prob'ly) always work. If it doesn't, something else has gone wrong. This should be good for around 12-15,000 files at a time.
*/
    %if %sysevalf(%sysfunc(datetime()) > "&bearer_expiry"dt) %then
        %renew_bearer;
    %do %until(%eval(&sys_prochttp_status_code) = 200);
        proc http url="&url"
             proxyhost="http://webproxy.vsp.sas.com:3128" 
             oauth_bearer="&bearer"
             in='scope=urn://onmicrosoft.com/vcp/api/vbi/.default'
             out=source
             timeout=1000                    /* How long to wait (seconds) */
             method='get';
        headers 'Accept' = 'application/json'
                'consistencylevel' = 'eventual';
        run;
        %if %eval(&sys_prochttp_status_code > 200) %then %do;
            %put %sysfunc(strip(%sysfunc(datetime(), datetime23.3))) HTTP Status code: &sys_prochttp_status_code %refnumv(val=&i) &=file;
            %let fail_count = %eval(&fail_count + 1);
            %if %eval(&fail_count > 5) %then %do;
                %check_status
                %goto EndMac;
                %end;
            %let rc = %sysfunc(sleep(30, 1));
            %end;
        %end;
    %put File %refnumv(val=&i): %sysfunc(strip(%sysfunc(datetime(), datetime19.))) %sysfunc(strip(%sysfunc(putn(&lastmodified, datetime23.)))) &parent_directory/&source/&file;
    filename source clear;
    %EndLoop:
    %end;

I could obviously check for the symbol existence of sys_prochttp_status_code before I check its contents - but its non-existence isn't something I had considered!

I'm pretty much convinced that it's something specific with the service account, but my ingestion jobs run through it literally thousands of times a day without error, including many that use proc http, and I've never seen this before.

 

Has anyone ever seen anything like this before? What is tkzCapture()? what are toa4en and t0b4en, and why can't they be (re-)opened? They do exist - seven years old. Maybe it's a factor of running M6; M8 may be getting installed mid-year.

6 REPLIES 6
andreas_lds
Jade | Level 19

Sorry, i have no idea what could cause the error message. But if a migration is planed later this year waiting for M9 could be an option. It should be released mid year.

LaurieF
Barite | Level 11
Yeah I know, but I have no influence over when it gets installed. I work for a very large organisation, with many SAS users, so arranging for even M8 to be installed is quite a thing.

On top of that, my code is supposed to go live next month.

Laurie
LaurieF
Barite | Level 11
As you can see from the code, I’m doing that. There is one filename reference which I clear at the bottom of the loop.

Laurie
ballardw
Super User

@LaurieF wrote:
As you can see from the code, I’m doing that. There is one filename reference which I clear at the bottom of the loop.

Laurie

Maybe you are closing it.

This bit of your code has a %goto Endmac but I do not see the label Endmac in your code. So this may be skipping completely out this macro to somewhere else.

            %if %eval(&fail_count > 5) %then %do;
                %check_status
                %goto EndMac;
                %end;
            %let rc = %sysfunc(sleep(30, 1));
            %end;
        %end;
    %put File %refnumv(val=&i): %sysfunc(strip(%sysfunc(datetime(), datetime19.))) %sysfunc(strip(%sysfunc(putn(&lastmodified, datetime23.)))) &parent_directory/&source/&file;
    filename source clear;
    

sas-innovate-white.png

Our biggest data and AI event of the year.

Don’t miss the livestream kicking off May 7. It’s free. It’s easy. And it’s the best seat in the house.

Join us virtually with our complimentary SAS Innovate Digital Pass. Watch live or on-demand in multiple languages, with translations available to help you get the most out of every session.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 364 views
  • 0 likes
  • 4 in conversation