BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
SamMad
Obsidian | Level 7

Hello Everyone, I am a newbie to SAS (Enterprise Guide). I am trying to import a large CSV data ( 7GB file) of about 18 Million records. I have copied the file to SAS server using WinSCP and I now trying to import that file to a SAS Dataset. I did use the %LET command to specify the Source and Destination. I turned off the Data cleaning checks but sometime after 2 hours or so . I get the following error:

Insufficient memory to continue the execution of the program.

I checked the space on the destination folder and I have over 450GB so that cannot be a problem. Any suggestions?

Also since I am a newbie, does SAS first tries to import a file into another form of text in SAS and then as a second step, it takes that imported file and converts to SAS format data set?

 

Appreciate community help if they help in the right direction.

 

Thanks,

SamMad

1 ACCEPTED SOLUTION

Accepted Solutions
TomKari
Onyx | Level 15

Glad to hear you made it this far! Like every other high-end product these days, SAS has multiple ways to import .csv files.

 

1. Using the "File | Import Data" dialogue in Enterprise Guide. In your case, DON'T use this, as it wants to examine the data on your local PC, and of course your data is on the server. This is a great option if you have a small dataset on your PC.

 

2. SAS has a software tool called "PROC IMPORT" which will bring data from a variety of formats into SAS. I was able to import a .csv version of the sashelp.cars dataset using this syntax:

proc import datafile="C:\ddd\cars.csv" out=cars dbms=csv;
run;

One nice thing about this option is it prints in the log the SAS code that it generates, so you can grab it and tailor it to your needs.

I suggest you give that code a try, and see what you think.

 

3. Just write plain old SAS code. This is my option of choice, since i) I've been around SAS since before .csv files existed, or PROC IMPORT or Enterprise Guide, for that matter, and ii) it gives me exquisite control over what's happening with my data. Comparing with the massive code generated by PROC IMPORT, I can pull in my .csv of the sashelp cars dataset with just

data work.cars;
length Make $13 Model $39 Type $6 Origin $6 DriveTrain $5 MSRP 8 Invoice 8 Engine_Size 8 Cylinders 8 Horsepower 8 MPG_City 8 MPG_Highway 8 Weight 8 Wheelbase 8;
infile "C:\Users\Tom\Desktop\cars.csv" lrecl=32767 dsd firstobs=2;
input Make Model Type Origin DriveTrain MSRP Invoice Engine_Size Cylinders Horsepower MPG_City MPG_Highway Weight Wheelbase;
run;

I know I'm kind of shoving a firehose up your nose and turning it on, it's a lot to learn. Ponder on and read about these, and come back with questions. I'm not going to be happy until you tell me you love SAS.

Tom

View solution in original post

33 REPLIES 33
Reeza
Super User
Whats the exact error message? Are you doing this on a desktop or connected to a server? Is 450GB RAM (memory) or disk space?
SamMad
Obsidian | Level 7

THank you for your prompt reply. I have 450GB on SAS Server Disk apce allocated to me. My Source CSV file is stored in a folder on SAS Server and Destination is also on the SAS Server. I just need to convert this CSV file to SAS Dataset. CSV file is about 7GB in Size and about 18 Mil records with 100 Columns. Did i answer your Q?

Reeza
Super User
No. Exact error message?
SamMad
Obsidian | Level 7
Insuffcient memory to continue the exucution of the program.
Kurt_Bremser
Super User

7 GB divided by 18 million gives me an average record length of less than 400, so your 100 columns can't be that wide.

Your best bet is to write the import step yourself, copy/pasting the column names and attributes from the documentation you received with the file.

And do not let EG open the dataset automatically after import.

Tom
Super User Tom
Super User

How are you "importing" it?  Don't use PROC IMPORT to read text files.  Just write a data step to read it.  It will be faster and more accurate.  And there is no way it will get "memory" errors.  Although make sure your target disk has enough space for the resulting dataset.

So if you had 100 numeric variables your data step is as simple as this.

data outlib.want;
  infile 'myfile.csv' dsd truncover firstobs=2 ;
  input var1-var100;
run;

For more complex data just define your variables (LENGTH statement) attach any REQUIRED informat (usually only date,time or datetime need informats) and REQUIRED (usually only date, time and datetime need formats) and any DESIRED labels or formats.  If you define your variables in the dataset in the same order they appear in the text file the input can use a position based variable list.

input firstvar -- lastvar;
ballardw
Super User

I think you may be running into an EG option that automatically attempts to display data sets as they are created. That display window may be what is running out of memory

 

You might try going to the Tools>Options>Data>Data General and uncheck the box for "Automatically open data when added to project"

 

 

SamMad
Obsidian | Level 7
Thank you. I have changed the setting and will try again to run the import function to see if it helps.
TomKari
Onyx | Level 15

Another possibility is that EG (which runs on your local PC) is trying to "look" at the csv file to figure out how to import it. As you're running SAS on a server, your best bet might be to not let EG "help" you at all.

 

Here's a program that will pull in the first six records of your csv file, and dump them to the log:

 

data _null_;
	infile "C:\ddd\cars.csv" lrecl=32767;
	input;
	putlog _infile_;

	if _n_ = 6 then
		stop;
run;

Substitute your directory and file, and see if it works. (Also, if you need troubleshooting later on, it gives you something to post to this site). Here's what my log looks like (I dumped sashelp.cars to a csv file):

 

NOTE: The infile "C:\ddd\cars.csv" is:
      Filename=C:\ddd\cars.csv,
      RECFM=V,LRECL=32767,File Size (bytes)=38595,
      Last Modified=August 14, 2020 17:20:13,
      Create Time=August 14, 2020 17:20:13

Make,Model,Type,Origin,DriveTrain,MSRP,Invoice,Engine Size (L),Cylinders,Horsepower,MPG (City),MPG (Highway),Weight (LBS),Wheelbase 
(IN),Length (IN)
Acura, MDX,SUV,Asia,All,"$36,945","$33,337",3.5,6,265,17,23,4451,106,189
Acura, RSX Type S 2dr,Sedan,Asia,Front,"$23,820","$21,761",2,4,200,24,31,2778,101,172
Acura, TSX 4dr,Sedan,Asia,Front,"$26,990","$24,647",2.4,4,200,22,29,3230,105,183
Acura, TL 4dr,Sedan,Asia,Front,"$33,195","$30,299",3.2,6,270,20,28,3575,108,186
Acura, 3.5 RL 4dr,Sedan,Asia,Front,"$43,755","$39,014",3.5,6,225,18,24,3880,115,197
NOTE: 6 records were read from the infile "C:\ddd\cars.csv".
      The minimum record length was 72.
      The maximum record length was 148.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

Assuming that works, here's another code snippet, that will simply run through your file and save the text of each record to a work dataset:

data work.check(compress=char);
	length InRec $32767;
	infile "C:\ddd\cars.csv" lrecl=32767;
	input;
	InRec = _infile_;
run;

I don't see anything weird in the description of your file or your system, but if there is this will hopefully bring it front and centre.

Neither of these programs use EG at all, so there shouldn't be any heavy communications.

Let us know what happens!

   Tom

 

 

 

SamMad
Obsidian | Level 7
Hello Tom,

Thanks for looking deep into this. I am trying to run your first snippet of code to view first six records of the large 7GB file and it I kept on bumping into an error. Even though I am specifying the location of the infile path to "C:\users\my folder\ABC.csv", the program runs throws an error, as mentioned below:
ERROR: Physical file does not exist, /opt/apps/sas/computer/config/Lev1/SAS/C:\users\my folder\ABC.csv

I know that file does exist in the folder C:\users\my folder\ABC.csv

What could be an issue that he cant file the file as located?

Thanks,
Reeza
Super User
You shouldn't process 7GB files with SAS UE - it's a limited version. You need the full version of SAS for this type of data.
It may work but it will take a long time....
SamMad
Obsidian | Level 7
Hi Reeza,
Thanks for your reply. I am using SAS EG so I not sure if it the same as UE? What is UE if it is different from EG? I was told by my Org's SAS team that this version is being used by Analytics community enterprise wide and I can load very large data sets on the SAS Server without any issues.
Your helpful input is much appreciated.
Thanks,
Tom
Super User Tom
Super User

Looks like SAS is running on a Unix machine, not your Windows machine.

Does that file already exist on that machine? Or at least a network drive that is accessible from the Unix machine?

 

If you are using Enterprise Guide to connect to the SAS server you might get more control by not using the Import Task.  Instead use the File Upload task (if the file is not already on that Unix machine somewhere).  Then you can reference the file from its location on the Unix machine where SAS is running.

TomKari
Onyx | Level 15

AHA! NOW we're making some progress. That's exactly why I suggested running that code, as this is a common problem.

 

The code runs on your SAS server, which means that you have to point it at the file that you uploaded to the server. I can see from the error message syntax that it's a *NIX server, so you're INFILE statement should be something like

 

infile "/home/sammad/ABC.csv" lrecl=32767;

 

pointing to where you copied the file.

 

Get this piece of code successfully accessing the .csv file, and we can move on from there.

 

Tom

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 33 replies
  • 4064 views
  • 13 likes
  • 7 in conversation