Hello,
I have a dataset that is sorted by only an ID variable. However, I would like to re-sort the data ID and Date of BIRTH and SEX in that order.
Can I use Proc SQL to store the resorted data in a new table ? Or will the order by be used only to display the results?
I am interested in resorting and saving the data in a new table, so any hints will help. I know it is easy to do this using Proc Sort in the Data Step
Thanks
Proc SQL;
create table NEW as select * from OLD
order by ID, DOB, SEX;
quit;
I wouldn't be too quick in assuming SQL ORDER BY produces the same result as PROC SORT.
Yes it will IF ... each combination of the sort variables has only one observation.
But if there is more than one observation with the same ID/DOB/SEX triplet, then you can't make that assumption. Now PROC SORT has an option (EQUALS vs NOEQUALS) that gives you a little control. The EQUALS option (which is the default) preserves the original order of observations with tied triplets. NOEQUALS does not enforce that result (which theoretically might allow a bit more efficiency). I do not believe that the ORDER BY clause in PROC SQL provides for honoring the EQUALS behavior, which would be necessary to guarantee identical results to PROC SORT
Yes! Your code will produce the outcome you are looking for. Dataset NEW will sorted by ID, DOB, SEX.
Can I use Proc SQL to store the resorted data in a new table ? Or will the order by be used only to display the results?
SAS tables are sequential and the sort order is how data gets stored physically. That's different to most databases.
Whether you use the SQL syntax as shared or Proc Sort will have the same result. Afaik both procedures use the same "module" in the background for the actual sorting of a SAS table.
Proc Sort gives you more options like NODUPKEY or in cases where the source table is potentially already sorted the way you want it PRESORTED (which then will only physically sort the table if it isn't already sorted).
I wouldn't be too quick in assuming SQL ORDER BY produces the same result as PROC SORT.
Yes it will IF ... each combination of the sort variables has only one observation.
But if there is more than one observation with the same ID/DOB/SEX triplet, then you can't make that assumption. Now PROC SORT has an option (EQUALS vs NOEQUALS) that gives you a little control. The EQUALS option (which is the default) preserves the original order of observations with tied triplets. NOEQUALS does not enforce that result (which theoretically might allow a bit more efficiency). I do not believe that the ORDER BY clause in PROC SQL provides for honoring the EQUALS behavior, which would be necessary to guarantee identical results to PROC SORT
Thank you for this response. I have come across more than one observation with the same ID/DOB/SEX triplet sort key. And you are correct. Comparing the Proc SORT vs. PROC SQL, the PROC SQL method doesn't have the same EQUALS behavior as PROC SORT, so when I attempted to pick the first record in the sort key based on PROC SQL or PROC SORT I have noticed slight differences in output. Yes you do get the same number of observations, but the distribution looks different.
This is mostly (*) equivalent to
proc sort
data=old
out=new
;
by id dob sex;
run;
*) see the other posts in the thread.
Catch the best of SAS Innovate 2025 — anytime, anywhere. Stream powerful keynotes, real-world demos, and game-changing insights from the world’s leading data and AI minds.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.