Dear SAS community,
I am trying to read in a huge text file (70 GB) using the code below, but it keeps failing with a "disk full" error (please see the attached screenshot):
PROC IMPORT Out=work.gene1_2
DATAFILE=
"\\.psf\Home\Dropbox (Partners HealthCare)\BWH Cardiac MR\Grants Application\PIZ\data_analysis\Results\Genotype\Kwong_plate1&2_052217_finalreport.txt"
DBMS=dlm REPLACE;
delimiter='09'X;
RUN;
Any solution would be much appreciated.
Thanks,
Raymond
Get a bigger hard drive, or allocate more resources to your SAS system. A file of that size may well need more disk space than you have available. Speak with your IT group; there is not much else I can say, really.
Oh, actually, you may want to take the data step code that PROC IMPORT generates (it appears in the log) and modify it to match the structure of your data. This is far better than letting PROC IMPORT guess and then having to go in and change a load of things afterwards, especially with this amount of data.
Sorry, could you be more specific about what you mean by "modify it to meet the dataset structure"?
Thanks
When you read a text file with proc import, it generates a data step and runs it. That datastep can be found in the log (and copied from there).
That data step will tell us a lot about the structure of the resulting dataset, and maybe there are some options that might let you work around your resource problem.
So, proc import should generate some code in the log that looks something like:
data want;
  infile "your data file.txt"...;
  length...;
  informat ...;
  input ...;
run;
This is the code that actually runs. PROC IMPORT merely scans your data, guesses the best informats, lengths, etc., and then generates this code. Editing it yourself is better, because you know the data better than PROC IMPORT does.
Also note that PROC IMPORT has overheads of its own: it needs to read in a sample of your data and process it to guess the data structure. That is another good reason to drop PROC IMPORT and write the data step directly.
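To make that concrete, here is a minimal sketch of what a hand-written data step could look like. The variable names, lengths, and the single header row are assumptions for illustration only; take the real ones from the code PROC IMPORT wrote to your log. The file name is a placeholder, and the library path is just an example location outside WORK.

```sas
/* Sketch only: variable names, lengths, and layout are hypothetical.   */
/* Replace them with what PROC IMPORT generated in your log.            */
libname x 'd:\temp\';                 /* example library outside WORK   */

data x.gene1_2 (compress=yes);        /* COMPRESS=YES can save space on */
                                      /* character-heavy data           */
    infile "your data file.txt"       /* placeholder file name          */
           dlm='09'x dsd truncover    /* tab-delimited, tolerate short  */
                                      /* or empty fields                */
           lrecl=32767 firstobs=2;    /* assumes exactly one header row */

    /* Declare character lengths explicitly so that columns are not     */
    /* stored wider than the data requires.                             */
    length snp_name  $ 20
           sample_id $ 12
           allele1   $ 1
           allele2   $ 1;

    input snp_name sample_id allele1 allele2 gc_score;
run;
```

Keeping the character lengths as short as the data allows, and listing only the columns you need up to the last one of interest, is usually what reduces the size of the resulting dataset the most. It may be worth running the step with OBS=1000 on the INFILE statement first to check the layout before reading all 70 GB.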
Do not use the WORK library. Try another one, for example:
libname x v9 'd:\temp\';
PROC IMPORT Out=X.gene1_2