Help using Base SAS procedures

Does order of the variables in a composite index matter?

Reply
Occasional Contributor
Posts: 10

Does order of the variables in a composite index matter?

Does the order of the data in a composite index matter? I have a composite index of an 8 character key + 4 character year. The year can only be one of 20 past years, the key can be any combination of letters and numbers. If my data has 100 million rows, approximately 5 million for each year, does it matter if I create a composite index as key+year or year+key?

 

I am wondering if the subsetting of the data by year would make the index (year+key) a bit faster than the other way around.

Super User
Posts: 6,757

Re: Does order of the variables in a composite index matter?

Posted in reply to set_all__

Is your data sorted?  Usually you would try to mimic the sorted order of the data as the primary key.  (It also depends on the patterns of records that you want to retrieve.)  Here's why.

 

When retrieving a record using an index, SAS doesn't retrieve data directly from the source data set.  Rather, it moves a block of records (that includes the next one you want) into memory.  From that block, it retrieves the next record that you want.

 

The major cost is moving the block of data that contains the next desired record.  Retrieving an individual record is a minimal cost.  So minimize the movement of blocks of records.

Occasional Contributor
Posts: 10

Re: Does order of the variables in a composite index matter?

Posted in reply to Astounding

The data is not sorted. It could possibly take too much time to sort the 100 million records before I need to kick off the next process in a short monthly production window. However, I am testing this out. I am also going to check if separating this dataset into multiple datasets (by year/multiple years) would make it any faster. Thanks for the suggestion!

Trusted Advisor
Posts: 1,337

Re: Does order of the variables in a composite index matter?

Posted in reply to set_all__

Why not make 2 composite indices, and run some tests?

 

I suspect, as @Astounding suggested, that you should order the components of a composite index that most closely approximate the physical record order of the indexed data set.

 

Super User
Posts: 5,876

Re: Does order of the variables in a composite index matter?

Posted in reply to set_all__
If your key is unique within a year, you could even be fine with a single index.
What is your application? End users asking for a one key value at the time? And sub second response times?
If you are in a Base SAS libref now, consider SPDE instead - parallel IO and better index management and query plans.
Data never sleeps
Ask a Question
Discussion stats
  • 4 replies
  • 108 views
  • 0 likes
  • 4 in conversation