
cannot read in a huge txt file using proc import

Contributor
Posts: 39

Dear SAS community

I am trying to read in a huge txt file (70 GB) using the code below, but it keeps failing with a "disk full" error (please see the attached screenshot, Screen Shot 2017-06-08 at 8.18.36 AM.png).

PROC IMPORT Out=work.gene1_2
  DATAFILE="\\.psf\Home\Dropbox (Partners HealthCare)\BWH Cardiac MR\Grants Application\PIZ\data_analysis\Results\Genotype\Kwong_plate1&2_052217_finalreport.txt"
  DBMS=dlm REPLACE;
  delimiter='09'X;
RUN;

Any solution would be much appreciated.

Thanks,
Raymond


Accepted Solution
06-08-2017 08:32 AM
Super User
Posts: 7,977

Re: cannot read in a huge txt file using proc import

Get a bigger hard drive, or allocate more resources to your SAS system. A file of that size may well need more disk space than you have available. Speak with your IT group; there is not much else I can say, really.

Oh, actually, you may want to take the data step code that proc import generates (it is written to the log) and modify it to match the structure of your data. This is far better than letting proc import guess and then having to go in and change a load of things afterwards, especially with this amount of data.

Contributor
Posts: 39

Re: cannot read in a huge txt file using proc import

Sorry, could you be more specific about "modify it to meet the dataset structure"?

Thanks

Super User
Posts: 7,832

Re: cannot read in a huge txt file using proc import

When you read a text file with proc import, it generates a data step and runs it. That datastep can be found in the log (and copied from there).

That data step will tell us a lot about the structure of the resulting dataset, and maybe there are some options that might let you work around your resource problem.
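One way to get at that generated data step without processing all 70 GB might be to run the import on a small slice first. This is only a sketch: it assumes the OBS= system option limits the records read by the data step that proc import generates, and the 1000-record limit, the GUESSINGROWS value, the output name and the shortened file name are all placeholders rather than values from this thread.

```sas
options obs=1000;                    /* assumed limit: read only the first 1000 records */

proc import out=work.sample          /* hypothetical output name */
    datafile="your data file.txt"    /* placeholder for the real path */
    dbms=dlm replace;
  delimiter='09'x;                   /* tab-delimited, as in the original post */
  guessingrows=500;                  /* rows scanned to guess types and lengths */
run;

options obs=max;                     /* restore the default before the real run */
```

The log from this small run should contain the generated data step, which can then be copied and edited.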

 

Super User
Posts: 7,977

Re: cannot read in a huge txt file using proc import

So, proc import should generate some code in the log that looks something like:

data want;
  infile "your data file.txt"...;
  length...;
  informat ...;
  input ...;
run;

This is the code that actually runs; proc import merely scans your data, guesses the best informats, lengths, etc., and then generates this code. Editing it yourself is better, as you know the data better.

Also note that proc import has overheads of its own: it needs to read in a sample of your data and process it to guess what the data structure is. That is another good reason to drop the proc import and write the data step directly.
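As an illustration of what an edited data step could look like, here is a minimal sketch. The variable names, lengths and the assumption of a single header row are hypothetical, since the real layout of the file is not shown in this thread; take the real ones from the data step in your log. The COMPRESS=YES data set option is an addition that may reduce disk usage for character-heavy data, not something the generated code includes.

```sas
/* Hypothetical layout: replace names, lengths and informats
   with the ones proc import wrote to your log. */
data want (compress=yes);            /* compression may shrink the output data set */
  infile "your data file.txt"        /* placeholder for the real path */
         dlm='09'x dsd truncover
         firstobs=2                  /* assumes one header row */
         lrecl=32767;
  length snp_name  $40               /* keep character lengths as small as the data allow */
         sample_id $20
         allele1   $1
         allele2   $1;
  input snp_name sample_id allele1 allele2 gc_score;
run;
```

Keeping character lengths tight matters at this scale, because every observation reserves the full declared length for each character variable when the data set is not compressed.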

 

Super User
Posts: 10,041

Re: cannot read in a huge txt file using proc import

Do not use the WORK library. Try another one, for example:

libname x v9 'd:\temp\';

PROC IMPORT Out=X.gene1_2
  /* remaining options unchanged from the original step */
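Before switching libraries, it may be worth checking where WORK actually lives, since that is the disk that fills up. A small sketch, assuming the d:\temp\ folder from the reply above exists and sits on a drive with enough free space:

```sas
/* Write the physical location of the WORK library to the log */
%put NOTE: WORK is stored in %sysfunc(pathname(work));

/* Permanent library on another drive (path taken from the reply; adjust as needed) */
libname x v9 'd:\temp\';

/* Confirm where the new library points */
%put NOTE: X is stored in %sysfunc(pathname(x));
```

If both paths turn out to be on the same drive, moving the output to X will not help and a different location is needed.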
