topic Re: Proc SQL vs Proc Sort/Set statement in SAS Procedures

Proc SQL vs Proc Sort/Set statement
Elkridge_SAS, Wed, 01 Dec 2010 12:40:23 GMT
https://communities.sas.com/t5/SAS-Procedures/Proc-SQL-vs-Proc-Sort-Set-statement/m-p/42062#M10906

I have a dataset with 16 million records and 64 variables, 2 of which I am looking to use to subset the data. Of the two, call them x and y, x has duplicates, and I am looking to choose the one record for each unique x that has the highest value of y. I know I can do this either with proc sql, using a "group by" approach, or by sorting first and then using a set statement with a first.x approach. My concern here is which approach is generally considered more efficient? I used to run away from all sorts until I realized that sometimes a Proc SQL approach could be equally time-consuming. Any insights would be greatly appreciated.

Re: Proc SQL vs Proc Sort/Set statement
Peter_C, Wed, 01 Dec 2010 16:17:29 GMT
https://communities.sas.com/t5/SAS-Procedures/Proc-SQL-vs-Proc-Sort-Set-statement/m-p/42063#M10907

Since you have only 16M rows and 2 columns to de-duplicate, I could recommend a hash table of keys with the row-numbers of the preferred row (for re-loading the data), but you might get the logical equivalent with the TAGSORT option of proc sort - the sort work areas should then be needed only for the keys (and that tag).
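For reference, the two approaches named in the question might look roughly like the sketch below. The dataset names HAVE, SORTED, WANT_SORT and WANT_SQL are assumptions for illustration; only x and y come from the post. This is not a benchmark, just the shape of each approach.

```sas
/* Assumed names: WORK.HAVE is the 16M-row input, x is the key, y the ranking value. */

/* Approach 1: sort so the highest y comes first within each x, then keep that row. */
proc sort data=have out=sorted;
    by x descending y;
run;

data want_sort;
    set sorted;
    by x;          /* BY-group processing sets first.x=1 on the first row of each x */
    if first.x;    /* subsetting IF: keep only the highest-y row per x */
run;

/* Approach 2: PROC SQL with GROUP BY and HAVING. */
/* SAS remerges the group maximum back onto the detail rows and notes this in the log. */
proc sql;
    create table want_sql as
    select *
    from have
    group by x
    having y = max(y);
quit;
```

The two are not exact equivalents: when several rows of the same x tie on the highest y, the sort/first.x version keeps one of them, while the SQL version keeps all of them. Which one runs faster depends on the data and the system, so timing both on a sample is probably the safest guide.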
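A minimal sketch of the two suggestions in the reply, under some assumptions: the dataset names (HAVE, BEST_ROWS, WANT_HASH, WANT_TAGSORT) and the helper variables (best_y, best_row, row, p) are made up for illustration, HAVE is assumed to be a non-empty native SAS dataset with no deleted observations (so the loop counter matches the observation number), and one hash entry per distinct x is assumed to fit in memory.

```sas
/* Pass 1: read only x and y; remember, per x, the highest y and its row number. */
data _null_;
    length best_y best_row 8;
    declare hash h();
    h.defineKey('x');
    h.defineData('x', 'best_y', 'best_row');
    h.defineDone();
    call missing(best_y, best_row);

    do row = 1 by 1 until (done);
        set have(keep=x y) end=done;
        if h.find() ne 0 then do;       /* first time this x is seen */
            best_y = y;
            best_row = row;
            h.add();
        end;
        else if y > best_y then do;     /* find() loaded the stored best_y for this x */
            best_y = y;
            best_row = row;
            h.replace();
        end;
    end;

    h.output(dataset: 'best_rows');     /* one row per x: x, best_y, best_row */
    stop;
run;

/* Put the row numbers in ascending order so the re-load moves forward through HAVE. */
proc sort data=best_rows;
    by best_row;
run;

/* Pass 2: re-load the full 64-variable rows by direct access. */
data want_hash(drop=best_row);
    set best_rows(keep=best_row);
    p = best_row;                       /* POINT= variable; not written to the output */
    set have point=p;
run;

/* Alternative from the reply: TAGSORT keeps only the BY variables and   */
/* observation numbers in the sort work files, then fetches full rows.   */
proc sort data=have out=want_tagsort_sorted tagsort;
    by x descending y;
run;

data want_tagsort;
    set want_tagsort_sorted;
    by x;
    if first.x;
run;
```

On ties the hash version keeps the earliest row with the highest y, because a later row replaces the stored one only when its y is strictly greater. TAGSORT reduces temporary disk use but can increase run time, so it is worth timing against a plain sort.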