BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
DucatiJong
Calcite | Level 5

Hi,

 

I have some preparation steps and like to keep these as view but problem occurred when I try to perform a sort (as one of the prep step). My sort step is critical so any idea how I can sort the view (ideally retaining it as a view)

 

proc sql;

  create view Prep as

  select ID

             ,Vars

  from InputData;

  quit;

 

proc sort data = Prep nodupkey;

  by ID;

  run;

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

Of course, if your data set is already sorted, that's easy:

 

data want / view=want;

set have;

by id;

if first.id;

run;

 

But if it isn't, hashing will solve this.  Unfortunately, the syntax is beyond the limit of my hashing skills.  But here's the idea:

 

data want / view=want;

load data into a hash table with ID as the key (and vars as the data), with instructions to ignore duplicates;

unload the data from the hash table into the data set;

run;

View solution in original post

5 REPLIES 5
Astounding
PROC Star

You would have to put the results into a data set:

 

proc sort data=prep nodupkey out=next_step;

by ID;

run;

 

Alternatively, you could modify the SQL view, adding UNIQUE to get rid of duplicates, and adding ORDER BY to get a sorted result.  Along the lines of:

 

proc sql;

create view Prep as

select distinct ID, vars

from InputData

order by ID;

quit;

DucatiJong
Calcite | Level 5

Thanks for your reply. I could have provided more detail. I was using the sort to enforce retention of the first record based on ID. In another word if i have different variable values against the ID i would only want to keep the first record of that ID. Distinct would retain all different occurences.

Astounding
PROC Star

Of course, if your data set is already sorted, that's easy:

 

data want / view=want;

set have;

by id;

if first.id;

run;

 

But if it isn't, hashing will solve this.  Unfortunately, the syntax is beyond the limit of my hashing skills.  But here's the idea:

 

data want / view=want;

load data into a hash table with ID as the key (and vars as the data), with instructions to ignore duplicates;

unload the data from the hash table into the data set;

run;

DucatiJong
Calcite | Level 5

Cool. Did use the first. approach but was hoping that there's some option or trick with proc sort. Something along the line of:

 

proc sort data= PrepData / view=PrepData... etc

 

Just looking to expand my 'sort' horizon. 

Kow
Obsidian | Level 7 Kow
Obsidian | Level 7

What about a UNIQUE clause in your SELECT SQL statement right up front?

 

SELECT UNIQUE id ...

 

 

You could also do something simple like this :

 

proc sort  data = VIEWNAME  out = sorted ; ***I think the VIEW name can be inserted directly here???**;

      by ID someother_var   ;

 

data select_unique  ;

    set sorted  ;

    by ID someother_var  ;

    if first.ID      ;

 

This will select a unique value for ID based on the sorted value of some other variable(s).  The most obvious example would be to sort by date. Selecting the first observation will automatically select the most recent date as and the observation will be unique for ID in the resulting view or dataset.

 

    

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 5 replies
  • 2901 views
  • 0 likes
  • 3 in conversation