SORTED X INDEXED

New Contributor
Posts: 4

SORTED X INDEXED

Can we say that the main difference between a sorted SAS data file and an indexed one is that the first is in physical order, while the second is in logical order?

Grand Advisor
Posts: 17,360

Re: SORTED X INDEXED

I'm not familiar with the term 'logical' order when it comes to datasets. Can you explain that?

Googling "sort vs index" turned up the following, which holds for SAS as well, apart from the new-table part: a SAS sort does write a new table, but that table can have the same name as the old one.

Sorting versus indexing

Sorting a table physically reorders data into a sequential order and outputs the results to a new ACL table. Indexing does not make any change to the underlying physical order of data. Instead, it creates a separate index file that references records in the active table, allowing direct access to the records in a sequential order rather than a physical order.
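
To make the quoted distinction concrete, here is a minimal language-neutral sketch in Python. It is not SAS or ACL code, and the `rows` table and its columns are made up for illustration. Sorting builds a reordered copy, while an index is a separate list of row positions that leaves the table untouched.

```python
# Hypothetical four-row table; names and values are illustrative only.
rows = [
    {"id": 3, "name": "Carol"},
    {"id": 1, "name": "Alice"},
    {"id": 4, "name": "Dave"},
    {"id": 2, "name": "Bob"},
]

# Sorting: build a new, physically reordered copy of the table.
sorted_rows = sorted(rows, key=lambda r: r["id"])

# Indexing: leave the table as-is and store row positions in key order.
index_by_id = sorted(range(len(rows)), key=lambda i: rows[i]["id"])

# Reading through the index yields the rows in key order.
via_index = [rows[i] for i in index_by_id]

print([r["id"] for r in rows])       # physical order is unchanged
print(index_by_id)                   # positions to visit, in key order
print([r["id"] for r in via_index])  # same order as the sorted copy
```

Both routes deliver the rows in the same key order; the difference is whether the stored data itself was rearranged.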

Help - ACL 9.2.0

FAQ paper on Index

http://www2.sas.com/proceedings/sugi30/008-30.pdf

New Contributor
Posts: 4

Re: SORTED X INDEXED

By "logical order" I mean that the SAS data file itself remains intact. The order comes from its index file.

Grand Advisor
Posts: 9,578

Re: SORTED X INDEXED

Yes, I think so. Once you create an index, you can use a BY statement on the indexed variable anywhere, with no need to sort the table first.

Respected Advisor
Posts: 4,978

Re: SORTED X INDEXED

Some other issues you might want to know about ...

As a general rule, it is extremely time-consuming for SAS to retrieve an entire data set in sorted order by using an index.  Indexes are better suited for retrieving a small subset, rather than retrieving an entire data set.
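
A rough sketch of why that is, written in Python rather than SAS. The `rows` data and the dictionary-based `index` are assumptions for illustration and say nothing about how SAS actually stores an index. Looking up one key touches only the matching rows, whereas reading everything in key order means jumping around the file.

```python
from collections import defaultdict

# Hypothetical table in its physical (unsorted) order.
rows = [("east", 10), ("west", 20), ("east", 30), ("north", 40), ("west", 50)]

# Illustrative index: key value -> positions of the matching rows.
index = defaultdict(list)
for pos, (region, _) in enumerate(rows):
    index[region].append(pos)

# Small subset: the index points straight at the one matching row.
north_rows = [rows[p] for p in index["north"]]

# Whole table in key order: every row is visited, but not sequentially.
visit_order = [p for key in sorted(index) for p in index[key]]

print(north_rows)
print(visit_order)
```

The positions in `visit_order` are out of sequence, which is the kind of scattered access that makes a full indexed read slower than reading a physically sorted file from top to bottom.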

When you sort a data set, SAS records the sort order with the data set. You can see it at the end of a PROC CONTENTS report. So if you were to run the exact same PROC SORT twice, SAS is smart enough to skip the second one. There is also a SORTEDBY= data set option, which tells SAS that a data set is already in order when it was created by some means other than PROC SORT. That has its complications, however, so don't use it blindly.

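Re-creating a data set (for example, replacing it with a DATA step) destroys the index, although updates made in place, such as with PROC APPEND, keep the index maintained. But in many cases, changing a data set preserves the order of the observations, so that future sorting may not be needed.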

So there are other differences, but are they "main differences"?  Beauty is in the eye of the beholder.
