Tom
Super User

@Season wrote:
So the selection process still takes place after the entire importation is done, right?

A text file is linear. There is no way to read it without actually reading it.

Season
Barite | Level 11

Thank you! I consulted DeepSeek on resolving this issue in R, and it suggested a "flowing decompression" method for dealing with this problem. In short, batches of observations are decompressed, imported, filtered, and stored. When one cycle finishes, the next batch is decompressed while the previous batch's decompressed file is deleted, and so on. The stored observations, which are what we ultimately want but are spread across multiple small datasets for the time being, are then stacked to form one large dataset. Can SAS do something like this?

Tom
Super User

Why would you want to?  SAS does NOT load the whole dataset into memory to work with it, the way base R does with variables (objects, as R calls them).  So tricks to make it use less memory are typically not needed when working in SAS.
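As a minimal sketch of that point, a single DATA step can read the compressed file one line at a time and keep only the rows of interest, so only the selected observations are ever written to disk. The file path, the column names (id, group, value) and the selection rule are hypothetical, and the GZIP option of the FILENAME ZIP access method is assumed to be available (SAS 9.4M5 or later).

```sas
/* Hypothetical path: point this at the real .csv.gz file.           */
/* Assumes the GZIP option of FILENAME ZIP (SAS 9.4M5 or later).     */
filename gzin zip '/path/to/bigfile.csv.gz' gzip;

data work.selected;
  /* FIRSTOBS=2 skips the header line; lines are read sequentially. */
  infile gzin dsd dlm=',' truncover firstobs=2 lrecl=32767;
  /* Hypothetical columns; replace with the real 40 variables.      */
  input id :$20. group :$10. value;
  /* Subsetting IF: only rows that pass are written to the output.  */
  if group = 'A';
run;
```

Every line is still read, but rows that fail the test are discarded immediately, so no intermediate decompressed copy of the file is needed.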

Season
Barite | Level 11

Because the entire dataset is too large to load. I understand that the import process might not need the whole file to be loaded into memory in SAS, but the problem is that the resulting imported dataset is too large to be stored in memory as well.

Tom
Super User

SAS stores datasets on disk, not in memory.  So large amounts of memory are not needed to work with datasets, especially one that only has 40 variables.  The only place you would have memory issues is if you tried to do an analysis that creates matrices too large to store in memory, for example using a CLASS variable with millions of distinct levels.

Saving such a large dataset on disk might be an issue, however.  The SAS dataset structure is not that efficient, but using the COMPRESS=YES option can make datasets take a little less disk space.

Season
Barite | Level 11

Thank you for your patient explanation! Could you please tell me where to specify the COMPRESS=YES option?

Tom
Super User

You can set it as a system option using the OPTIONS statement.

options compress=yes;

You can set it at the LIBREF level using the COMPRESS= option of the LIBNAME statement.

libname mylib 'myfolder_name' compress=yes;

You can set it at the DATASET level using the COMPRESS= dataset option.

data mylib.myds(compress=yes);
  infile .....

Season
Barite | Level 11

Is it possible, then, to specify the starting and ending row of the .csv.gz file and let SAS read in the designated subset of data only?

Tom
Super User

Yes.  You have already seen how to set the starting observation number (really the starting LINE number) with the FIRSTOBS= option in the example INFILE statements posted above.  To tell SAS where to stop, use the OBS= option of the INFILE statement.

So to read the first 100 lines of actual data you would use FIRSTOBS=2 and OBS=101 (skipping the header line).
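A minimal sketch of that case follows. The file path and the column names are hypothetical, and the GZIP option of the FILENAME ZIP access method is assumed to be available (SAS 9.4M5 or later).

```sas
/* Hypothetical path to the compressed CSV file. */
filename gzin zip '/path/to/bigfile.csv.gz' gzip;

data work.first100;
  /* Line 1 is the header, so data starts on line 2.             */
  /* OBS=101 is the LAST line to read, not a count of lines,     */
  /* so lines 2 through 101 give 100 data rows.                  */
  infile gzin dsd dlm=',' truncover firstobs=2 obs=101;
  /* Hypothetical columns; replace with the real variable list.  */
  input id :$20. group :$10. value;
run;
```

In general, to read N data rows starting at line F, set FIRSTOBS=F and OBS=F+N-1.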

Season
Barite | Level 11
Thank you so much for your very informative and helpful reply! So my question can now be resolved by a macro.
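One way such a macro might look is sketched below. The macro name, its parameters, the file path, the column names and the selection rule are all hypothetical, and the GZIP option of FILENAME ZIP is assumed to be available (SAS 9.4M5 or later). Each pass reads one window of lines with FIRSTOBS= and OBS=, keeps the selected rows, and appends them to a running result.

```sas
/* Hypothetical macro: read the file in line-number windows. */
%macro read_batches(nbatches=3, batchsize=100000);
  %local i first last;

  /* Start from a clean result table so reruns do not double up. */
  proc datasets lib=work nolist;
    delete selected_all;
  quit;

  %do i = 1 %to &nbatches;
    /* +2 and +1 account for the header on line 1. */
    %let first = %eval((&i - 1) * &batchsize + 2);
    %let last  = %eval(&i * &batchsize + 1);

    data work.batch;
      infile gzin dsd dlm=',' truncover firstobs=&first obs=&last;
      input id :$20. group :$10. value;  /* hypothetical columns   */
      if group = 'A';                    /* hypothetical selection */
    run;

    /* PROC APPEND creates the base table on the first pass. */
    proc append base=work.selected_all data=work.batch;
    run;
  %end;
%mend read_batches;

/* Hypothetical path to the compressed CSV file. */
filename gzin zip '/path/to/bigfile.csv.gz' gzip;
%read_batches(nbatches=3, batchsize=100000)
```

Because a text file is linear, each batch still has to read past all earlier lines to reach its starting line, so later batches take longer. A single DATA step with a subsetting IF reads the file only once and is usually the simpler choice unless batching is needed for some other reason.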

