anuragraishines
Quartz | Level 8

Hi,

I have written code that processes 60 million records and takes about 1.5 hours to run.

There is a push to reduce the run time to 15 minutes.

The data set has 15 columns, all of them text.

I need some expert advice on whether SAS can process 60 million records in 15 minutes.

Regards,

Anurag Rai

8 REPLIES
PeterClemmensen
Tourmaline | Level 20

It can. Show us your code if you want advice.

anuragraishines
Quartz | Level 8

Hi,

Thanks for the prompt reply!

There are about 30 input files that act as data sources for the code.

There are about 250 business rules, essentially if/else conditions. Each rule creates its own data set, and these data sets are appended at the end.

Regards,

Anurag Rai

Kurt_Bremser
Super User

Is the dataset stored with compress=yes? This is often a very big factor when dealing with character variables.

And show us the code in question.

To answer your question: yes, surely. It is a matter of making your data structures and code efficient, and sizing the infrastructure (CPU, RAM, storage) correctly.
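
As one illustration of what "efficient code" could mean here: if each business rule currently reads the source and writes its own data set, a single DATA step that evaluates the rules in one pass avoids rereading 60 million records many times. The sketch below is only an assumption about how such rules might look. The data set names (work.records, work.rule_results), the columns (col_a, col_b), and the rule IDs are all hypothetical, and it assumes each record matches at most one rule.

```sas
/* Hypothetical names throughout: work.records, col_a, col_b, rule_id. */
/* One pass over the input instead of one pass per business rule.      */
data work.rule_results;
    set work.records;
    length rule_id $8;

    /* Each branch stands in for one business rule. */
    if col_a = 'X' then rule_id = 'R001';
    else if col_b = 'Y' then rule_id = 'R002';
    else rule_id = 'NONE';
run;
```

If a record can satisfy several rules at once, this if/else chain would not be equivalent to the original logic, so the real code would need to be checked first.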

anuragraishines
Quartz | Level 8

Hi,

Thanks for the prompt reply!

No, it is not stored with the compress option.

Regards,

Anurag Rai

Kurt_Bremser
Super User

@anuragraishines wrote:

No, it is not stored with the compress option.


Then you must test it with the compress option. This may reduce physical dataset file size significantly and therefore your I/O load.
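
A minimal sketch of such a test, assuming a hypothetical source data set work.records: copy it with the COMPRESS= data set option and compare the file sizes. SAS writes a note to the log saying by what percentage the size decreased (or increased).

```sas
/* Hypothetical data set names; adjust to your own library. */
/* Write a compressed copy and check the log note for the   */
/* percentage change in size.                               */
data work.records_compressed (compress=yes);
    set work.records;
run;

/* Alternatively, compress every data set created from here on. */
options compress=yes;
```

Compression costs some extra CPU time for reading and writing, so it is worth timing the real job with and without it.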

Also run your steps with options fullstimer to get more in-depth diagnostic information. A large difference between CPU time and real time always points to the I/O subsystem.
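
For example, the option can be switched on around the steps you want to measure. The data set names below are hypothetical placeholders for one of your own steps.

```sas
/* Ask SAS to write detailed resource statistics to the log. */
options fullstimer;

/* Placeholder step; substitute one of your real steps here. */
data work.step_out;
    set work.records;
run;

/* Switch back to the shorter timing notes afterwards. */
options nofullstimer;
```

With FULLSTIMER on, the log for each step lists real time, user CPU time, system CPU time, and memory usage, which makes the CPU versus real time comparison straightforward.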

ballardw
Super User

You may also want to include a description of your environment. If the data manipulation involves network storage or transmission, then you may have network issues to consider. If the data is read from or written to another database system, your connection may have an impact.

Reeza
Super User
It's possible, but it depends on your hardware (RAM, cores, CPU/GPU) and software. Viya, for example, works in memory, so it could be faster than 15 minutes, but you need a lot of RAM to load 60 million records. So theoretically it's definitely possible. Without further information on what you're trying to do, it's impossible to say whether it will work on your system.
anuragraishines
Quartz | Level 8

Hi,

Thanks for the prompt reply!

There are about 30 input files that act as data sources for the code.

There are about 250 business rules, essentially if/else conditions. Each rule creates its own data set, and these data sets are appended at the end.

And it is not the Viya platform.

Regards,

Anurag Rai

