SAS Programming

DATA Step, Macro, Functions and more
BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
mtakayesu1
Fluorite | Level 6

Hello,

 

I have a dataset that is sorted by only an ID variable.  However, I would like to re-sort the data ID and Date of BIRTH  and SEX in that order.

Can I use Proc SQL to store the resorted data in a new table ?  Or will the order by be used only to display the results?

 

I am interested in resorting and saving the data in a new table, so any hints will help.  I know it is easy to do this using Proc Sort in the Data Step

Thanks

 

Proc SQL;

  create table NEW as select * from OLD

  order by ID, DOB, SEX;

quit;

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

I wouldn't be too quick in assuming SQL ORDER BY produces the same result as PROC SORT.

 

Yes it will IF ... each combination of the sort variables has only one observation.

 

But if there is more than one observation with the same ID/DOB/SEX triplet, then you can't make that assumption.   Now PROC SORT has an option (EQUALS vs NOEQUALS) that gives you a little control.  The EQUALS option (which is the default) preserves the original order of observations with tied triplets.   NOEQUALS does not enforce that result (which theoretically might allow a bit more efficiency).   I do not believe that the ORDER BY clause in PROC SQL provides for honoring the EQUALS behavior, which would be necessary to guarantee identical results to PROC SORT

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

5 REPLIES 5
sjb1
Fluorite | Level 6

Yes! Your code will produce the outcome you are looking for. Dataset NEW will sorted by ID, DOB, SEX.

Patrick
Opal | Level 21

Can I use Proc SQL to store the resorted data in a new table ?  Or will the order by be used only to display the results?

SAS tables are sequential and the sort order is how data gets stored physically. That's different to most databases.

 

Whether you use the SQL syntax as shared or Proc Sort will have the same result. Afaik both procedures use the same "module" in the background for the actual sorting of a SAS table.

Proc Sort gives you more options like NODUPKEY or in cases where the source table is potentially already sorted the way you want it PRESORTED (which then will only physically sort the table if it isn't already sorted).

mkeintz
PROC Star

I wouldn't be too quick in assuming SQL ORDER BY produces the same result as PROC SORT.

 

Yes it will IF ... each combination of the sort variables has only one observation.

 

But if there is more than one observation with the same ID/DOB/SEX triplet, then you can't make that assumption.   Now PROC SORT has an option (EQUALS vs NOEQUALS) that gives you a little control.  The EQUALS option (which is the default) preserves the original order of observations with tied triplets.   NOEQUALS does not enforce that result (which theoretically might allow a bit more efficiency).   I do not believe that the ORDER BY clause in PROC SQL provides for honoring the EQUALS behavior, which would be necessary to guarantee identical results to PROC SORT

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
mtakayesu1
Fluorite | Level 6

Thank you for this response.  I have come across more than one observation with the same ID/DOB/SEX triplet sort key.  And you are correct.  Comparing the Proc SORT vs. PROC SQL, the PROC SQL method doesn't have the same EQUALS behavior as PROC SORT, so when I attempted to pick the first record in the sort key based on PROC SQL or PROC SORT I have noticed slight differences in output.  Yes you do get the same number of observations, but the distribution looks different.  

 

 

 

 

sas-innovate-white.png

Our biggest data and AI event of the year.

Don’t miss the livestream kicking off May 7. It’s free. It’s easy. And it’s the best seat in the house.

Join us virtually with our complimentary SAS Innovate Digital Pass. Watch live or on-demand in multiple languages, with translations available to help you get the most out of every session.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1229 views
  • 4 likes
  • 5 in conversation