BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
hhchenfx
Barite | Level 11

Hi Everyone,

My data has date, time, ID and value.

For each row, say date=1, time =12 and ID = 1, I want to find the 1st highest, 2nd highest and 3rd highest values for that id, of that date, and up to that time (<12).

The illustration of that row at the end.

Can you please help me?

Thank you,

HHC

data have;
  input date time ID value;
cards;
0 1 1 88888
0 1 9 585
1 1 1 999
1 2 1 7
1 3 1 3
1 4 1 101
1 7 1 102
1 9 1 12
1 12 1 9999
1 13 1 105
1 14 1 1
2 12 1 5
2 13 1 8 
1 2 124
1 2 11
1 2 9999
1 2 9
;


data temp; set have;
if id=1 and date=1 and       time<12 ;run;

proc sort data=temp; by id descending value;run;

*pick to 3;
data temp; set temp;
if _N_<=3;
run;

*spread horizontally;
proc transpose data=temp out=want (drop = _NAME_);
by date id;
var value;run;

data want; set want;
rename col1=top1 col2=top2 col3=top3;
time=12;run;

 

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

My first thought about this is to use an ordered hash object, but actually you can more neatly combine a RETAIN statement with a CALL SORTN:

 

data have;
  input date time ID value  ;
cards;
0 1 1 88888
0 1 9 585  
1 1 1 999  
1 2 1 7    
1 3 1 3    
1 4 1 101  
1 7 1 102  
1 9 1 12   
1 12 1 9999
1 13 1 105 
1 14 1 1   
2 12 1 5   
2 13 1 8   
1 2 124    
1 2 11     
1 2 9999   
1 2 9      
run;

proc sort ; 
  by id date time ;
run;

data want (drop=_:);
  set have;
  by id date;
  retain topvalue1-topvalue3  _dummy ;

  array sortvars _dummy topvalue3 topvalue2 topvalue1 ;

  if first.date then call missing(of sortvars{*});
  call sortn(of sortvars{*});
  output;
  _dummy=value;
run;

Reset the array sortvars (containing a history of VALUEs) to missing at the start of each date.

 

The call sortn routine will put the highest value in that history at the rightmost element of the sortvars array  (i.e. TOPVALUE1).  That's why the last three elements of sortvars array are in "reverse" order.

 

After sorting, output the observations.  Only after that, replace the 4th largest historic value (i.e. variable _DUMMY)   If the new value will be part of the top 3, that will be determined by the call sortn during the next observation.

 

I list the variables in the RETAIN statement in desired order, and place the RETAIN prior to the ARRAY statement (in reverse order).  The variable order of the first statement will prevail in the output dataset, even though the call sortn uses them in another order.

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

5 REPLIES 5
mkeintz
PROC Star

My first thought about this is to use an ordered hash object, but actually you can more neatly combine a RETAIN statement with a CALL SORTN:

 

data have;
  input date time ID value  ;
cards;
0 1 1 88888
0 1 9 585  
1 1 1 999  
1 2 1 7    
1 3 1 3    
1 4 1 101  
1 7 1 102  
1 9 1 12   
1 12 1 9999
1 13 1 105 
1 14 1 1   
2 12 1 5   
2 13 1 8   
1 2 124    
1 2 11     
1 2 9999   
1 2 9      
run;

proc sort ; 
  by id date time ;
run;

data want (drop=_:);
  set have;
  by id date;
  retain topvalue1-topvalue3  _dummy ;

  array sortvars _dummy topvalue3 topvalue2 topvalue1 ;

  if first.date then call missing(of sortvars{*});
  call sortn(of sortvars{*});
  output;
  _dummy=value;
run;

Reset the array sortvars (containing a history of VALUEs) to missing at the start of each date.

 

The call sortn routine will put the highest value in that history at the rightmost element of the sortvars array  (i.e. TOPVALUE1).  That's why the last three elements of sortvars array are in "reverse" order.

 

After sorting, output the observations.  Only after that, replace the 4th largest historic value (i.e. variable _DUMMY)   If the new value will be part of the top 3, that will be determined by the call sortn during the next observation.

 

I list the variables in the RETAIN statement in desired order, and place the RETAIN prior to the ARRAY statement (in reverse order).  The variable order of the first statement will prevail in the output dataset, even though the call sortn uses them in another order.

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Tom
Super User Tom
Super User

Didn't you ask this question last week?

data have;
  input date time ID value;
cards;
0 1 1 88888
0 1 9 585
1 1 1 999
1 2 1 7
1 3 1 3
1 4 1 101
1 7 1 102
1 9 1 12
1 12 1 9999
1 13 1 105
1 14 1 1
2 12 1 5
2 13 1 8 
1 2 124 .
1 2 11 .
1 2 9999 .
1 2 9 .
;

proc sort;
  by date time;
run;

data want ;
  set have;
  by date time;
  array history [0:99] _temporary_;
  array big largest1-largest5 ;
  if first.date then call missing(of row history[*] big[*]);
  row+1;
  history[mod(row,100)]=value;
  do index=1 to dim(big);
    big[index]=largest(index,of history[*]);
  end;
  drop index;
run;

proc print;
run;
Obs    date    time      ID    value    largest1    largest2    largest3    largest4    largest5    row

  1      0       1        1    88888      88888          .           .           .           .        1
  2      0       1        9      585      88888        585           .           .           .        2
  3      1       1        1      999        999          .           .           .           .        1
  4      1       2        1        7        999          7           .           .           .        2
  5      1       2      124        .        999          7           .           .           .        3
  6      1       2       11        .        999          7           .           .           .        4
  7      1       2     9999        .        999          7           .           .           .        5
  8      1       2        9        .        999          7           .           .           .        6
  9      1       3        1        3        999          7           3           .           .        7
 10      1       4        1      101        999        101           7           3           .        8
 11      1       7        1      102        999        102         101           7           3        9
 12      1       9        1       12        999        102         101          12           7       10
 13      1      12        1     9999       9999        999         102         101          12       11
 14      1      13        1      105       9999        999         105         102         101       12
 15      1      14        1        1       9999        999         105         102         101       13
 16      2      12        1        5          5          .           .           .           .        1
 17      2      13        1        8          8          5           .           .           .        2
PeterClemmensen
Tourmaline | Level 20

@hhchenfx , it seems that you've asked this quested 2 times before, but with slightly different input data. 

 

In this version of the question, it seems that the difference is that the date variable is not strictly increasing for each observation? Am I correct to assume that your requirement is the same as previous, but that you want to maintain the original order?

 

Please be very specific. Makes it much easier to provide an accurate answer 🙂

hhchenfx
Barite | Level 11

Thank you all for helping.

These are amazing code.

HHC

hhchenfx
Barite | Level 11

@PeterClemmensen , thanks for reminding me.

I will try to make it as clear as I can.

Thanks,

HHC

 

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 450 views
  • 2 likes
  • 4 in conversation