Hi,
I have some preparation steps and like to keep these as view but problem occurred when I try to perform a sort (as one of the prep step). My sort step is critical so any idea how I can sort the view (ideally retaining it as a view)
proc sql;
create view Prep as
select ID
,Vars
from InputData;
quit;
proc sort data = Prep nodupkey;
by ID;
run;
Of course, if your data set is already sorted, that's easy:
data want / view=want;
set have;
by id;
if first.id;
run;
But if it isn't, hashing will solve this. Unfortunately, the syntax is beyond the limit of my hashing skills. But here's the idea:
data want / view=want;
load data into a hash table with ID as the key (and vars as the data), with instructions to ignore duplicates;
unload the data from the hash table into the data set;
run;
You would have to put the results into a data set:
proc sort data=prep nodupkey out=next_step;
by ID;
run;
Alternatively, you could modify the SQL view, adding UNIQUE to get rid of duplicates, and adding ORDER BY to get a sorted result. Along the lines of:
proc sql;
create view Prep as
select distinct ID, vars
from InputData
order by ID;
quit;
Thanks for your reply. I could have provided more detail. I was using the sort to enforce retention of the first record based on ID. In another word if i have different variable values against the ID i would only want to keep the first record of that ID. Distinct would retain all different occurences.
Of course, if your data set is already sorted, that's easy:
data want / view=want;
set have;
by id;
if first.id;
run;
But if it isn't, hashing will solve this. Unfortunately, the syntax is beyond the limit of my hashing skills. But here's the idea:
data want / view=want;
load data into a hash table with ID as the key (and vars as the data), with instructions to ignore duplicates;
unload the data from the hash table into the data set;
run;
Cool. Did use the first. approach but was hoping that there's some option or trick with proc sort. Something along the line of:
proc sort data= PrepData / view=PrepData... etc
Just looking to expand my 'sort' horizon.
What about a UNIQUE clause in your SELECT SQL statement right up front?
SELECT UNIQUE id ...
You could also do something simple like this :
proc sort data = VIEWNAME out = sorted ; ***I think the VIEW name can be inserted directly here???**;
by ID someother_var ;
data select_unique ;
set sorted ;
by ID someother_var ;
if first.ID ;
This will select a unique value for ID based on the sorted value of some other variable(s). The most obvious example would be to sort by date. Selecting the first observation will automatically select the most recent date as and the observation will be unique for ID in the resulting view or dataset.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.