juanvg1972
Pyrite | Level 9

Hi, I am using the SASFILE statement to load SAS tables into memory:

 

sasfile work.ventas load;

 

My dataset has 10 million rows and is 2 GB, and my machine has 8 GB of RAM. But once the dataset is loaded in memory, I haven't seen any improvement in the performance of my PROCs and DATA steps compared with when the dataset is on disk.

I don't know whether this option (sasfile ... load) is meant to speed up only certain types of steps or data management tasks. My steps are PROC FREQ, PROC MEANS, and typical DATA steps that calculate fields.

Also, perhaps I need a LASR server to work efficiently in memory with SAS. Can anybody help me?

Any advice will be greatly appreciated.

1 ACCEPTED SOLUTION
mkeintz
PROC Star

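As a pure test of reading the dataset from memory, run a DATA _NULL_ step that reads the dataset from disk, then run it again on the same dataset after loading it with SASFILE:

data _null_;                /* read from disk; no output dataset is written */
  set work.ventas;
run;

sasfile work.ventas load;   /* load the dataset into memory */
data _null_;                /* same read, now served from memory */
  set work.ventas;
run;

Then compare the timings of the two DATA _NULL_ steps in the log.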
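SASFILE tends to pay off when the same dataset is read several times, because the load itself costs one full read. A minimal sketch of that pattern follows; `region` and `importe` are hypothetical variable names used only for illustration, not columns known to exist in work.ventas.

```sas
/* Sketch only: region and importe are hypothetical variables. */
sasfile work.ventas load;      /* read the whole dataset into memory once */

proc freq data=work.ventas;    /* first pass over the in-memory copy */
  tables region;
run;

proc means data=work.ventas;   /* second pass over the same in-memory copy */
  var importe;
run;

sasfile work.ventas close;     /* release the memory held by the dataset */
```

When comparing timings, keep in mind that the operating system's file cache may already hold a recently read dataset, which can make the disk and SASFILE runs look similar.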

 

 


3 REPLIES
Patrick
Opal | Level 21

@juanvg1972

PROC MEANS will likely need to sort the data, which is done multithreaded but uses temporary intermediate datasets on disk (UTILLOC). Having your source table in memory will at best speed up the read from source into this process. Depending on the inner workings of PROC MEANS, that read may not be the bottleneck, in which case reading from memory won't improve performance at all.

 

Given the above, whether and when having the data in memory improves performance depends on the processing details.
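One way to check whether reading is the bottleneck for a given step is to compare real time with CPU time in the log. This is a sketch under assumptions: `importe` is a hypothetical numeric variable, and the statistics requested are arbitrary.

```sas
options fullstimer;    /* log real time, user and system CPU time, and memory */

proc means data=work.ventas n mean sum;
  var importe;         /* hypothetical numeric variable */
run;

options nofullstimer;  /* switch back to the shorter timing note */
```

If real time is close to total CPU time, the step is probably compute-bound and faster reads are unlikely to help much. A large gap between the two suggests time spent waiting on I/O, although multithreaded steps can blur this comparison.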

 

There are options such as BUFNO that influence the performance of read and write operations on disk, and it can be worth investigating whether the default values are optimal for your environment.
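As a rough illustration of that kind of investigation, the current settings can be displayed with PROC OPTIONS and a different buffer count tried on a single read with the BUFNO= data set option. The value 100 is an arbitrary starting point for experimentation, not a recommendation.

```sas
proc options option=bufno;    /* show the current BUFNO setting in the log */
run;

proc options option=bufsize;  /* show the current BUFSIZE setting in the log */
run;

/* Try a larger number of buffers for this one read only. */
data _null_;
  set work.ventas(bufno=100); /* 100 is an arbitrary test value */
run;
```

Repeating the DATA _NULL_ step with a few different values and comparing the timings in the log gives a sense of whether buffering matters for this dataset.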

 

Here is a paper that might be useful:

http://support.sas.com/resources/papers/proceedings09/333-2009.pdf 

 

