Can we say that the main difference between a sorted SAS data file and an indexed one is that the first is in physical order and the second is in logical order?
I'm not familiar with the term 'logical' order when it comes to datasets, can you explain that?
Googling "sort vs index" turned up the following, which holds for SAS as well, apart from the new-table part: SAS does create a new table, but it can have the same name as the old one.
Sorting a table physically reorders data into a sequential order and outputs the results to a new ACL table. Indexing does not make any change to the underlying physical order of data. Instead, it creates a separate index file that references records in the active table, allowing direct access to the records in a sequential order rather than a physical order.
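To make the physical/logical distinction concrete, here is a small language-neutral sketch in Python. It is not SAS code and does not show how SAS implements indexes; the records and variable names are made up for illustration. Sorting builds a reordered copy of the rows, while indexing leaves the rows where they are and builds a separate list of row positions that can be followed in key order.

```python
# Hypothetical records, stored in arrival order (the "physical" order).
rows = [
    {"name": "Mary", "age": 15},
    {"name": "Alfred", "age": 14},
    {"name": "Judy", "age": 14},
    {"name": "Barbara", "age": 13},
]

# Sorting: build a new table whose physical order follows the key.
sorted_rows = sorted(rows, key=lambda r: r["name"])

# Indexing: leave `rows` untouched and store row positions in key order.
name_index = sorted(range(len(rows)), key=lambda i: rows[i]["name"])

# Reading through the index visits the rows in logical (key) order.
via_index = [rows[i] for i in name_index]

print(name_index)                # [1, 3, 2, 0]
print(via_index == sorted_rows)  # True: same sequence, different storage
print(rows[0]["name"])           # Mary: the original table is unchanged
```

Both approaches deliver the rows in the same sequence; the difference is whether that sequence is stored in the table itself or in a separate structure that points into it.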
FAQ paper on Index
By "logical order" I mean that the SAS data file itself remains intact; the order comes from its index file.
Yes, I think so. Once you create an index, you can use a BY statement on the indexed variable anywhere, with no need to sort the table first.
Some other issues you might want to know about ...
As a general rule, it is extremely time-consuming for SAS to retrieve an entire data set in sorted order by using an index. Indexes are better suited for retrieving a small subset, rather than retrieving an entire data set.
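A rough way to picture why, again as a Python sketch rather than SAS code (the data, the `lookup` helper, and the index layout are all assumptions made for illustration): an index lets you jump straight to the few rows that match a key, but walking the whole table through it means hopping between row positions instead of reading them sequentially.

```python
from bisect import bisect_left, bisect_right

# Hypothetical table in physical (arrival) order.
rows = [
    {"id": 40, "value": "a"},
    {"id": 10, "value": "b"},
    {"id": 30, "value": "c"},
    {"id": 10, "value": "d"},
    {"id": 20, "value": "e"},
]

# Separate index: (key, row position) pairs kept in key order.
index = sorted((r["id"], pos) for pos, r in enumerate(rows))
keys = [k for k, _ in index]

def lookup(key):
    """Return the rows whose id equals `key`, found by binary search."""
    lo = bisect_left(keys, key)
    hi = bisect_right(keys, key)
    return [rows[pos] for _, pos in index[lo:hi]]

# Small subset: only the matching rows are touched.
print([r["value"] for r in lookup(10)])  # ['b', 'd']

# Whole table in key order: the positions visited jump around.
print([pos for _, pos in index])         # [1, 3, 4, 2, 0]
```

On real storage, that scattered access pattern is what makes a full pass through an index slower than reading a table that is already physically sorted.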
When you sort a data set, SAS stores the sort order with the data set. You can see it at the end of a PROC CONTENTS report. So if you run the exact same PROC SORT twice, SAS is smart enough to skip the second one. There is also a SORTEDBY= data set option, which tells SAS that a data set is already in order when it was created by some means other than PROC SORT. That has its complications, however, so don't use it blindly.
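Rebuilding a data set (for example, replacing it with a DATA step) destroys the index, although in-place updates such as PROC APPEND maintain it. But in many cases, changing a data set preserves the sorted order of the observations, so that future sorting may not be needed.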
So there are other differences, but are they "main differences"? Beauty is in the eye of the beholder.
