BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
KachiM
Rhodochrosite | Level 12

ScottBass:

 

SQL vs Hash Objects:

Program efficiency comes when data set is very large almost reaching the resources-limit and/or the program runs several times a day. You may simply compare both the methods, see the time, both run time and CPU times and  Memory they take. Decide the choice.

 

Inventing new algorithm:

If there is a SAS way, we prefer to use and at the same time, watch time and memory used. If you invent a new algorithm which takes relatively less time and less memory, why not use your new way? Sometimes, a SAS function may be available but requires data to be transformed into a new form using Proc  SORT, Proc TRANSPOSE , ...It takes overhead. For instance, you have a data set of Patients, each Patient has measurements of blood sugar for 10 Visits, arranged into columns. You ask for  3 maximum Values and the corresponding Visit Numbers. There are SAS functions, SORTN(), LARGEST() and WHICHN(). You can not use these functions directly unless you transpose the columns or copied to a new array. In this circumstance, you can find a new User function to directly get statistics on SLICED Array.

 

Using More than one variable like STD:

Mark has already answered this. If it Arrays, it is possible to use parallel arrays to take care of additional variables. For, STD, have one

array for FREQ, one for SUM and one for CORRECTED SUM OF SQUARES.

KachiM
Rhodochrosite | Level 12

BerndSE:

You asked: 

 

do i need the "if 0 then set sample" from your code?

Consider:

 

      if 0 then set sample;
      declare hash h(dataset:"sample", ordered: 'Y');

 

When you are loading the Data set ( SAMPLE ) in Bulk into the HASH TABLE, you need

 

if 0 then set sample;

 

to IMPACT the Program Data Vector (PDV). If that statement is not there, SAS does not consult the metadata of the data set by itself and will not succeed in defining the hash table.

 

For curiosity, drop the statement, run the program. This is how you learn the hash objects !

 

BerndSE
Fluorite | Level 6

Last post!

I implemented a proper SQL solution and a Hash solution (for comparison + learning).

 

=> Using SQL:  All it takes is pretty much one "proc sql create table". Really easy to understand, however... performance...

=> Using Hash: First time implementation is difficult, but the performance is insane compared to SQL. For around 7,000 data, Hash needs less than a second, while proc SQL needs around 16 seconds. Having written this as a macro, it will be even easy+fast to loop through this macro and assess different ranges and the impact on the plot

 

Regarding the "if 0 then set sample". My code currently does not have this and the values are still correct. However, I understand what it is used for now and I suppose its better to include. I found another description in some pdf file on the internet:

"SAS needs to know about the variables that are present in a hash table before it can use them. You can either use a LENGTH statement to define the variables, their types and lengths, or simply use a SET statement like this one. This SET statement will be read at compile time, information about the variables in the datasets will be extracted, but the statement will never actually execute due to the “if 0” condition."

 

Thanks everyone!

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 17 replies
  • 3764 views
  • 6 likes
  • 9 in conversation