Hi SAS friends,
I'm trying to import an Excel dataset (XLSX) that is ~0.505 GB and get this message:
"Memory request error
ERROR: Not enough memory available to allocate storage."
I'm working on a PC with ~32 GB of RAM; all files and SAS itself are on a solid-state drive, and there is no server.
Running proc options group=memory; run; gives:
Group=MEMORY
SORTSIZE=1073741824    Specifies the amount of memory that is available to the SORT procedure.
SUMSIZE=0              Specifies a limit on the amount of memory that is available for data summarization procedures when class variables are active.
MAXMEMQUERY=0          Specifies the maximum amount of memory that is allocated for procedures.
MEMBLKSZ=16777216      Specifies the memory block size for Windows memory-based libraries.
MEMMAXSZ=2147483648    Specifies the maximum amount of memory to allocate for using memory-based libraries.
LOADMEMSIZE=0          Specifies a suggested amount of memory that is needed for executable programs loaded by SAS.
MEMSIZE=2147483648     Specifies the limit on the amount of virtual memory that can be used during a SAS session.
REALMEMSIZE=0          Specifies the amount of real memory SAS can expect to allocate.
The SAS online help is not clear on the exact code needed to address this. I've posted the import code for the Excel file below; it is basic.
Any suggestions?
Thank you
proc import file="&Data.\ctg-studies_231206.xlsx" out=CT dbms=xlsx replace;
run;
I'd suggest increasing MEMSIZE as it's currently set to only 2GB. Try invoking SAS from a Windows command line like this:
sas -memsize = 3GB
Thanks, I entered this into the Windows command prompt, but it's not working:
cmd sas -memsize = 2GB
Is this command prompt syntax wrong?
Thank you
I did enter 3GB and 9GB actually ....
Actually, from the command prompt a SAS window does appear.
Here is the command prompt code and the SAS message:
start sas - memsize = 3gb
SAS MESSAGE:
ERROR: Unrecognized SAS option name .
ERROR: (SASXKRIN): KERNEL RESOURCE INITIALIZATION FAILED.
ERROR: Unable to initialize the SAS kernel.
Sorry, this is the correct syntax:
sas -memsize 3G
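Once SAS starts, you can confirm the setting actually took effect; PROC OPTIONS with the VALUE option also reports where the value came from (command line vs. config file):
proc options option=memsize value;  /* prints the MEMSIZE value and how it was set */
run;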
no problem, thanks...
This line in the Windows command pane:
sasv9.cfg sas - memsize 3g
reports details from sasv9.cfg, but it still shows "-MEMSIZE 2G" (unchanged).
getting closer!
Thanks SASkiwi, appreciate your input, and who knows, someday I might actually figure out how to increase the memory available to SAS.
Thanks again.
I would just use Excel to convert the XLSX file to a CSV file (or multiple CSV files if it has multiple sheets) and then write a data step to read that. Make sure to eliminate any linefeed characters that might appear in any of the cells in the XLSX file first, as they can cause SAS to think the observation has ended because the line has ended.
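A minimal sketch of one way to deal with embedded linefeeds, assuming the CSV was saved by Excel on Windows (so true record ends are CRLF while any newlines inside quoted cells are bare LF); the file name here is illustrative:
data check;
  infile "ctg-studies.csv" dsd truncover termstr=crlf firstobs=2;
  input;                                  /* load the raw record into _INFILE_ */
  nfields = countw(_infile_, ',', 'qm');  /* fields per record; bare LFs no longer split records */
run;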
Thanks very much, but the original .csv file is a mess. It contains text variables that have commas within substrings, missing fields, some weird characters, and > 450,000 rows. I have tried importing it into SAS as a .csv file without success, so I am using Excel as an initial import step.
If there is a variable-by-variable process to import the .csv, that might work. There is an index column and then 29 other variable columns. Is it possible to import the index column plus one variable column at a time, then clean up that variable column, then eventually merge the variable columns back together by index values? Or something else?
I don't see how using Excel is going to help at all. It will probably just make the problem harder, and that is not even counting the problems you will have trying to read the result.
What exactly is wrong with the file? Embedded commas are not a problem in a properly formatted CSV file. The values with commas do have quotes around them, right?
If only some of the lines are messed up, then run a simple data step to split the file into good and bad lines. Then you can probably just fix the bad lines in a text editor and put the file back together.
For example, if the issue is that the person who generated the file did not add quotes around the values that contain commas, you will end up with some lines that do not have the right number of fields on them. If the lines are short enough (32K bytes or less), you can use the _INFILE_ automatic variable.
So, for example, if you expected 20 fields per line, you might split it like this:
data _null_;
  infile 'original.csv';
  input;                                  /* loads the raw record into _INFILE_ */
  /* count comma-delimited fields, honoring quoted values ('q') and empty fields ('m') */
  if countw(_infile_,',','qm') = 20 then file 'good.csv';
  else file 'bad.csv';
  put _infile_;                           /* write the record out unchanged */
run;
Then fix the BAD lines, making fixed.csv.
Then read the two files together:
filename both ('good.csv','fixed.csv');   /* read the two files as one concatenated stream */
data want;
  infile both dsd truncover firstobs=2;   /* DSD honors quoted commas; TRUNCOVER handles records with too few fields without reading the next line */
  length var1 $10 var2 8 ..... var20 8 ;
  input var1 -- var20;
run;
Thank you very much. Cool code, and we learned how to import a large, messy .csv!
I was able to work with the original .csv file, using the code provided, to import it, and then used a data step to clean up the data further.
So that eliminated the need to expand the amount of memory available to SAS.
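Not the actual cleanup code used here, but a generic sketch of the kind of data step that strips stray non-printable characters (dataset and variable names are illustrative):
data ct_clean;
  set ct;
  text_var = compress(text_var, , 'kw'); /* 'k' = keep, 'w' = writable (printable) characters */
run;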
Much appreciated!
@rmacarthur wrote:
Thanks very much, but the original .csv file is a mess. It contains text variables that have commas within substrings, missing fields, some weird characters, and > 450,000 rows. I have tried importing it into SAS as a .csv file without success, so I am using Excel as an initial import step.
If there is a variable-by-variable process to import the .csv, that might work. There is an index column and then 29 other variable columns. Is it possible to import the index column plus one variable column at a time, then clean up that variable column, then eventually merge the variable columns back together by index values? Or something else?
Did the source of the CSV file provide any documentation of the contents?
Odd characters aren't generally a problem unless they get interpreted as "end of line" or similar control characters.
Also, before you attempted to read the CSV, did you by any chance open it with spreadsheet software and then save it when closing? (That can happen if you widened a column to read it; the spreadsheet thinks you "modified" the file and typically asks if you want to save the changes.) If so, it is very likely that the spreadsheet changed some of the contents of the file. If this possibly happened, go back to the source and start over.
PROC IMPORT, and the widgets that use it, assumes that you have a "nice" file where all of the rows are structured the same, and by default it only uses a small number of rows to guess properties for each column. If your file has a structure such as embedded "header" or "summary" information at the beginning or end of a related block of records, then you will need a data step to parse the file contents. Hopefully there is something that tells you what the breaks, changes, or other elements look like and when they appear or do not appear. Spreadsheet native formats are worse because each cell can have different characteristics in a single column: text, date, and currency all in the same column. PROC IMPORT will not handle any of these "nicely" but will typically default to character for everything.
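If you do try PROC IMPORT on the CSV again, one thing that sometimes helps with inconsistent columns is forcing it to scan every row when guessing column types; a minimal sketch (the file name is illustrative):
proc import datafile="messy.csv" out=work.ct dbms=csv replace;
  guessingrows=max;  /* scan all rows, not just the default sample, when guessing column types */
run;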
If your file contents are not too sensitive, you might try copying the first 50 lines (or, if there is an obvious change in the layout, enough lines to show maybe two complete groups) and pasting them into a text box opened with the </> button to show us the content. The text box is likely to be critical to keep the message window from reformatting the pasted text.
FWIW, I remember when Excel wouldn't even handle 450K lines to begin with (and wouldn't tell you that it had just ignored 270K lines when it finally opened), but SAS could read CSV files that size.
Thank you BallardW, much appreciated.
Great points, and I did avoid those pitfalls.
I do prefer not using Excel to import messy data because then, of course, one doesn't know what was left out and why.
And I have just learned a lot.
Thanks again!