Hello all --
I've been having a tough time trying to import a huge text file. It's around 12GB and roughly 35 million rows, and my goal is simply to create an indexed SAS dataset for a data library. The file is a delimited text file with "\001" as the delimiter. I've tried multiple methods to import it (wizard, infile, etc.) but I can't seem to get anything to run. I'm looking for some tips/suggestions on how to import this file efficiently.
Are there other tools that are better suited for the job?
When you say "\001" is the delimiter, do you mean 0x01, the SOH (start of heading) character? I'm not sure the Import Data task can detect/support that.
When using the Import Data task in EG, be sure to click the Performance button on the first page, then check "Bypass the data cleansing process". This will save quite a bit of overhead that you probably don't need.
If you can mock up a small subset of the data file with a delimiter that EG does support, then you could use the EG task to "design" the import step, then modify a copy of the generated code to use the proper delimiter and point to the actual source file.
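One way to mock up such a subset is a short DATA step that copies the first rows of the source file and swaps the SOH delimiter for one the Import Data task understands. This is only a sketch: both file paths, the 1000-row sample size, and the choice of comma as the replacement are assumptions, and it only works cleanly if the sampled rows contain no commas of their own.

```sas
/* Hypothetical paths: point these at the real source file and a scratch location. */
data _null_;
    infile "/path/to/bigfile.txt" lrecl=32767 obs=1000; /* read only the first 1000 records */
    file   "/path/to/sample.csv"  lrecl=32767;
    input;                                              /* load the raw record into _infile_ */
    _infile_ = translate(_infile_, ',', '01'x);         /* replace each SOH with a comma */
    put _infile_;                                       /* write the converted record */
run;
```

The resulting sample file can then be run through the Import Data task to generate the code that you adapt for the full file.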
If your EG is local and your SAS server is remote, and you need the text file to get "moved" to the server for import, consider using the Copy Files task to perform that part. However, you'll need a good chunk of temp space to hold a 12GB file.
Chris
Thanks for the response, Chris. Yes, you're correct that my delimiter is 0x01, the SOH (start of heading) character; sorry for the confusion. I attempted to import a small subset of my data using the import wizard, and you're right that the Import Data task does not support that type of delimiter. On the infile statement, is '01'x the delimiter value I want to use in this case?
Yes, that should do it.
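For illustration, a DATA step along these lines reads an SOH-delimited file and builds an index in the same pass. Treat it as a sketch under assumptions, not the exact code used in this thread: the libref, the paths, and the three columns (id, name, amount) are hypothetical placeholders for the real layout.

```sas
/* Hypothetical library path, file path, and column layout: adjust to the real data. */
libname mylib "/path/to/library";

data mylib.bigfile (index=(id));   /* build a simple index on id while loading */
    infile "/path/to/bigfile.txt"
           dlm='01'x               /* SOH character as the field delimiter */
           dsd                     /* consecutive delimiters mean a missing value */
           truncover               /* short records do not pull in the next line */
           lrecl=32767;            /* allow long records */
    length id 8 name $50 amount 8; /* declare types and lengths up front */
    input id name $ amount;
run;
```

If the file has a header row, adding firstobs=2 to the infile statement skips it.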
Fun fact: when EG creates a "clean, delimited version" of a raw text file for input, it uses the DEL character ('7F'x) to minimize the chance of conflicts with actual data content.
The Performance setting I described lets you skip that step -- usually a safe option when the incoming data is a clean, well-formed source.
Chris
Thanks Chris, your solution worked great. Only took about 5 minutes to import. Much appreciated.
I have a similar situation. I am able to get past the import wizard and create a file in SAS EG. But SAS EG opens the imported data set by default and spends hours trying to load it. I merely want to export the data to a saved SAS data set, so I don't need it displayed. I've tinkered with the SAS EG options related to Data, but nothing is helping.
In EG's Tools->Options->Results->Results General, unchecking Automatically open data or results when generated should prevent the imported data set from being opened automatically.
Casey