BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
lynnh20
Calcite | Level 5

I'm having issues where if I try to run PROC SQL or PROC SORT involving a larger medical claims dataset (~15 GB). I work off a VPN and need to write my data to a remote file storage system. My code submits without error in the log, but will not update in my file libraries. 

 

For example, the code I have been using is 

 

proc sort data = lib.large_data;

by ID Admit_date;

run;

 

and 

 

proc sql;

create table lib.merge as

select * 

from lib.small_data x

left join lib.large_data y 

on x.ID = y.ID;

quit;

 

a file titled lib.large_data.sas7bdat.lck will be created in my folder, but will not update. What I mean by this is that the file size will initialize at  0-168KB and will not grow any larger, no matter how long I let the code run. I understand that SAS creates .lck files to prevent overwrites, but I'm not sure why this is happening. 

 

What is also strange to me is that the code I'm showing has worked previously. I started having this issue in the past week, but I have been using this exact code to do merges for a few months.

 

If I use the PROC SQL statement:

 

create table lib.merge as

select * 

from lib.large data x 

where x.ID in (select y.ID from lib.small_data y);

quit;

 

then that allows me at least select correct records from the large dataset, but it's not ideal and I am still not able to sort the resulting dataset (~300 MB).  I can run other PROC statements on these datasets just fine. I.e, PROC CONTENTS, PROC MEANS, PROC FREQ, all work, but I can't sort or merge.

 

Does anyone have any idea why this is happening or how to resolve this issue? Any advice would be appreciated! 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

The lck file is created as soon as the PROC SORT starts, but then the procedure will read the source dataset and populate the utility file in WORK; during this, the lck file will stay unchanged. What you experience is a consequence of reading a dataset over a bad network connection.

Copy all datasets to a local library first, then do your work, and after that copy the final result back to the network storage.

View solution in original post

2 REPLIES 2
SASKiwi
Opal | Level 21

Reading and writing large data files over VPN to remote network storage is a recipe for very poor performance. I assume SAS is running on your local PC? How long was it taking when it was working?

 

I suggest running a test:

 

proc sql;
create table work.merge as
select * 
from lib.small_data x
left join lib.large_data y 
on x.ID = y.ID;
quit;

This will test how much faster it is writing to your local hard drive. If it is a lot faster and successful then it will prove that the VPN data connection is the problem.

 

Also this test should be even faster:

data work.small_data;
  set lib.small_data;
run;

proc sql;
create table work.merge as
select * 
from work.small_data x
left join lib.large_data y 
on x.ID = y.ID;
quit;

 

Kurt_Bremser
Super User

The lck file is created as soon as the PROC SORT starts, but then the procedure will read the source dataset and populate the utility file in WORK; during this, the lck file will stay unchanged. What you experience is a consequence of reading a dataset over a bad network connection.

Copy all datasets to a local library first, then do your work, and after that copy the final result back to the network storage.

SAS INNOVATE 2024

Innovate_SAS_Blue.png

Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.

If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website. 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Get the $99 certification deal.jpg

 

 

Back in the Classroom!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 750 views
  • 0 likes
  • 3 in conversation