BookmarkSubscribeRSS Feed
Dominik4
Fluorite | Level 6

Hi,

we are using WRS 4.4 and I have a table with 11.400.000 observations and 568 columns.

 

I hoped to increase report speed when adding an index. So I created one in meta data with for example column 3, column 5 and column 7. The index physically exists.

 

The report shows round about 10 - 15 columns in a list, sums up a few columns and has a where clause with 5 filter conditions (off course they contain the 3 columns of the index).

 

At the end the report does not load faster. I already tried to preorder the data before writing the table but that did not help. In map studio it is not visible if he uses the index or not.

 

Any Ideas how to speed up with an index?

16 REPLIES 16
Kurt_Bremser
Super User

An index will only benefit you if you extract a sufficiently small (think 10% or less) subset of your data. Otherwise, the overhead of reading the index file actually increases time.

Keep in mind that a SAS dataset is organized in pages, and that a whole page (128K) needs to be read anyway if a single observation from it is needed.

LinusH
Tourmaline | Level 20

It's been awhile since I worked with WRS and Information maps.

The key is to somehow capture the queries that the information map generates towards the Pooled WS server.

Then take those queries and run it outside WRS, with 

options msglevel=i;

As you might know it's important how the index is defined vs the query is written, e.g., the first column in the index must be part of the where clause.

If you wish to speed up the queries, you might want to consider moving the data to a SPDE libref for faster table scans, and slightly more advanced where clause optimizer.

 

Data never sleeps
Kurt_Bremser
Super User

Those 568 columns look suspicious to me. In all my professional work, I had datasets with such numbers only for regression analysis, but never for reporting.

Consider stripping down or restructuring your dataset (if there's data in structure, see Maxim 19).

Dominik4
Fluorite | Level 6

@Kurt_Bremser 

Thank you. Sorting Data and 3 (from over 500) columns in index should be much smaller that reading the data source.

Moreover you are right. Having 568 columns is rubbish and I'm already working on that.

 

@Madelyn_SAS 

Thank you. I will read it.

 

@LinusH 

renamed the data file to get an exception with the sql. Seems although I have 5 filters the sql does not contain a where clause. So this would be a reason why index is not working. Does it makes sense that he first load all data in a temp data set to filter afterwards??

 

LinusH
Tourmaline | Level 20

Again, I'm a litle rusty.

Not sure what you mean by creating a temporary table? You don't refer to the query cache...?

 

In case yo can't get an indexed search, you want to restructure your data, as @Kurt_Bremser points out. If you have several reports with with not much overlapping information, you could consider splitting your data set in to multiple.

 

For where clause, I'm not sure how the information map acts if the filters is only defined in WRS (the documentation leaves a lot to wish for when comes to query tuning). If it works for your use case you could try to define filters in the information map, and have them prompted in the WRS report?

 

Data never sleeps
Kurt_Bremser
Super User

Yes, the index file will be much smaller than the dataset file. But if the index doesnt't keep you from still reading most, if not all, of the dataset file itself, it only adds additional I/O. And if your "want" observations are scattered randomly across the dataset pages, this will happen even with a small subset.

If you also have the MDDB server at hand, consider creating and using a cube.

LinusH
Tourmaline | Level 20
I think that OLAP Server is the product that should be considered, MDDB server is a SAS V8 legacy construct.
Having Infomaps and WRS - it's likely OLAP Server is licensed as well with the EBIS offering.
Data never sleeps
Dominik4
Fluorite | Level 6

Thank you for your feedback. 

 

We also use the olap feature but in this case it is prefered to have a relational table.

 

Splitting Data is the idea for the future. I search for a fast solution while we are working on that.

 

@LinusH 

Based on your idea I created a chapter filter (hope it is the correct description, I dont have a English WRS) and now I can see that the SQL has an where clause. Unfortunately the where clause contains a strip command that prevents using the index. 

WHERE (STRIP( ( table0.myColumn ) )) IN ( '1D' )

 

Any ideas how to avoid that strip command? I guess then I reached by goal and he uses the index

Dominik4
Fluorite | Level 6
The data is stripped in dataset. The issue is that if the sql contains the strip command, the index is not used because he tries to transform again
LinusH
Tourmaline | Level 20
Sorry, I think we've reached my WRS knowledge limit...
I suggest you open a track to SAS tech support to get som detailed insight on the "WRS engine".
Data never sleeps

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 16 replies
  • 6338 views
  • 3 likes
  • 4 in conversation