JesusCamden
Calcite | Level 5

Background: My wife is in healthcare and heavily involved in research, but her primary job is patient care and she has no professional training in data science. She has asked for and is finally getting SAS and access to MarketScan, a large healthcare database. The folks in charge of the data access seem a little hesitant about the whole process, and they indicate that typical jobs take 20+ hours. They also seem to think we need a beefy computer to run the numbers effectively.

Questions:

What sort of computer should we request/build? Big SSD and lots of RAM seem obvious but what would a reasonable system be if you were building one? Is there an advantage to more cores for this process?

Is a desktop computer the most effective solution? I don't know enough about this, but it sounds like we are venturing into territory where a server or cloud-based option would be superior.

1 REPLY 1
ChrisNZ
Tourmaline | Level 20

Welcome to the SAS community!

 

You really are asking how long is a piece of string.

Since we don't know what your data looks like or what processes you run, it's impossible to give a precise answer.

 

In general:
- Statistics calculations benefit from more CPU power
- Data manipulation benefits from faster disks
- Both *might* benefit from more RAM in some cases.

  For example, when sorting, if all the data fits in memory, the sort time is greatly reduced.

 

In general again, SAS processes are I/O-bound rather than CPU-bound.
To know what your process's bottleneck is, you should study the log.
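Look at the steps with the longest elapsed time first.
If CPU time > (real time * 0.90), the step is probably CPU-bound; if CPU time is much lower than real time, the step spent most of its time waiting and is probably I/O-bound (multi-threading can push CPU time above real time, so this conclusion is not always true). [Edit: rephrased for -hopefully- more clarity]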
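
To make the log check less tedious, here is a minimal Python sketch that pulls the "real time" and "cpu time" figures out of a saved SAS log and ranks the steps by elapsed time. It assumes the default STIMER note layout ("NOTE: ... used (Total process time):" followed by "real time" and "cpu time" lines); the function names and the 0.90 cutoff are only illustrative, and a FULLSTIMER log would need extra handling. A ratio close to 1 means the CPU was busy for nearly the whole step, while a low ratio means the step mostly waited, which is often I/O.

```python
import re

# Header note that SAS writes at the end of each step (default STIMER layout).
STEP_RE = re.compile(r"^NOTE: (.+?) used \(Total process time\):")
# Timing lines, e.g. "real time  12.34 seconds" or "real time  1:02.50".
TIME_RE = re.compile(r"^\s*(real|cpu) time\s+([\d:.]+)")


def to_seconds(text):
    """Convert '12.34', '1:02.50' or '1:02:03.45' to seconds."""
    seconds = 0.0
    for part in text.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def summarize_steps(log_text, threshold=0.90):
    """Return one dict per step, longest elapsed time first."""
    steps = []
    current = None
    for line in log_text.splitlines():
        header = STEP_RE.match(line)
        if header:
            current = {"step": header.group(1), "real": None, "cpu": None}
            steps.append(current)
            continue
        timing = TIME_RE.match(line)
        if timing and current is not None:
            current[timing.group(1)] = to_seconds(timing.group(2))

    # Keep only steps where both timings were found and real time is non-zero.
    complete = [s for s in steps if s["real"] and s["cpu"] is not None]
    for s in complete:
        s["ratio"] = s["cpu"] / s["real"]
        # Rough hint only: multi-threaded steps can have cpu time > real time.
        if s["ratio"] > threshold:
            s["hint"] = "mostly computing (CPU)"
        else:
            s["hint"] = "mostly waiting (often I/O)"
    return sorted(complete, key=lambda s: s["real"], reverse=True)


if __name__ == "__main__":
    sample = """\
NOTE: DATA statement used (Total process time):
      real time           10.00 seconds
      cpu time            2.00 seconds

NOTE: PROCEDURE REG used (Total process time):
      real time           1:00.00
      cpu time            57.00 seconds
"""
    for s in summarize_steps(sample):
        print(f'{s["step"]}: real={s["real"]:.1f}s '
              f'cpu={s["cpu"]:.1f}s ratio={s["ratio"]:.2f} {s["hint"]}')
```

Run it against the full log of one of the 20+ hour jobs and start with the steps at the top of the list; those are the ones whose bottleneck matters most.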

 

In general still, the most important disk is the WORK disk.
- Making this storage area fast is important.
- You can split the load across two physical I/O subsystems, one for WORK and one for UTIL, rather than the default which is to use the same disk for both.

 

Other notes:
- To speed up disks, pooling several SSDs together (for example in a RAID 10 configuration for permanent data, or RAID 0 for WORK or UTIL) is a common and effective strategy.
- The cloud lets you reconfigure your hardware much more easily than if you have to manage the hardware yourself. You have to work within the vendor's limitations, though.
- Unix machines tend to have better CPUs and better I/O subsystems than x86 systems.
- On x86 systems, my experience is that Linux performs better than Windows under heavy load.

 

[Further note] The SAS licence is priced depending on the CPU(s), so high-spec RAM and storage cost nothing extra from a licensing viewpoint. On the other hand, get the fastest CPUs allowed by your SAS licence.
