Hi Everyone,
My data have ID, date, hour, value.
For each row, I I want look in the past 2 day and find the Top1 and Top2 value.
Since the number of hours within each days is not the same, I will not be able to use array with fix element.
In the data below, for row 6,7,9..., the value 999 will be the 1st largest and the value 55 will be the second largest.
Can you please help?
Thank you,
HHC
data have;
input id date hour value;
datalines;
1 1 1 5
1 1 2 4
1 1 3 55
1 2 1 9
1 2 3 7
1 2 6 999
1 3 9 9
1 3 10 8
1 3 10 8
1 4 1 5
1 4 2 18
1 5 10 6
2 1 2 4
2 1 3 6
;run;
data have; set have;
end_date = date+2;
run;
Just expand your data to have all of the hours in between.
Then normal rolling average methods will work.
If you don't have ETS licensed you can just use a simple data step (or even a data step view) to make them.
data have;
input id date hour value;
datalines;
1 1 1 5
1 1 2 4
1 1 3 55
1 2 1 9
1 2 3 7
1 2 6 999
1 3 9 9
1 3 10 8
1 3 10 8
1 4 1 5
1 4 2 18
1 5 10 6
2 1 2 4
2 1 3 6
;
data have48/view=have48;
do until (last.id);
set have ;
by id;
if first.id then do;
firstdate=date; firsthour=hour;
end;
end;
lastdate=date;
lasthour=hour;
date=firstdate;
hour=firsthour;
put (_all_) (=) ;
do until( lastdate <=date and lasthour<=hour) ;
output;
hour+1;
if hour>24 then do;
hour=1; date+1;
end;
end;
keep id date hour ;
run;
data want;
do until(last.id);
merge have48 have(in=in1);
by id date hour ;
array roll48 [48] ;
roll48[1+mod(date,2)*mod(hour-1,24)]=value;
if in1 then do;
big1 = largest(1, of roll48[*]);
big2 = largest(2, of roll48[*]);
output;
end;
end;
drop roll48: ;
run;
proc print;
run;
Obs id date hour value big1 big2 1 1 1 1 5 5 . 2 1 1 2 4 5 4 3 1 1 3 55 55 5 4 1 2 1 9 55 9 5 1 2 3 7 55 7 6 1 2 6 999 999 55 7 1 3 9 9 9 . 8 1 3 10 8 9 8 9 1 3 10 8 9 8 10 1 4 1 5 9 8 11 1 4 2 18 18 9 12 1 5 10 6 6 . 13 2 1 2 4 4 . 14 2 1 3 6 6 4
Use a two dimensional array (one for each day) and 24 hours wide to hold one value at most per hour. If you can have repeated values per hour, then the you'll need to increase the dimensions but still doable.
Post the expected results please.
Just expand your data to have all of the hours in between.
Then normal rolling average methods will work.
If you don't have ETS licensed you can just use a simple data step (or even a data step view) to make them.
data have;
input id date hour value;
datalines;
1 1 1 5
1 1 2 4
1 1 3 55
1 2 1 9
1 2 3 7
1 2 6 999
1 3 9 9
1 3 10 8
1 3 10 8
1 4 1 5
1 4 2 18
1 5 10 6
2 1 2 4
2 1 3 6
;
data have48/view=have48;
do until (last.id);
set have ;
by id;
if first.id then do;
firstdate=date; firsthour=hour;
end;
end;
lastdate=date;
lasthour=hour;
date=firstdate;
hour=firsthour;
put (_all_) (=) ;
do until( lastdate <=date and lasthour<=hour) ;
output;
hour+1;
if hour>24 then do;
hour=1; date+1;
end;
end;
keep id date hour ;
run;
data want;
do until(last.id);
merge have48 have(in=in1);
by id date hour ;
array roll48 [48] ;
roll48[1+mod(date,2)*mod(hour-1,24)]=value;
if in1 then do;
big1 = largest(1, of roll48[*]);
big2 = largest(2, of roll48[*]);
output;
end;
end;
drop roll48: ;
run;
proc print;
run;
Obs id date hour value big1 big2 1 1 1 1 5 5 . 2 1 1 2 4 5 4 3 1 1 3 55 55 5 4 1 2 1 9 55 9 5 1 2 3 7 55 7 6 1 2 6 999 999 55 7 1 3 9 9 9 . 8 1 3 10 8 9 8 9 1 3 10 8 9 8 10 1 4 1 5 9 8 11 1 4 2 18 18 9 12 1 5 10 6 6 . 13 2 1 2 4 4 . 14 2 1 3 6 6 4
Under the assumption that you've got max one data point per date and hour, you could still use a fixed array. The only additional bit of logic you need is to set array elements to missing for missing data points.
data inter;
set have;
by id date hour;
length largest_1 largest_2 8 _dh _l_dh 3;
array h48_ {48} 8 _temporary_;
/* array h48_ {48} 8; retain h48_;*/
_dh =date*24+hour;
_l_dh =lag(_dh);
if first.id then
do;
call missing(of h48_[*]);
h48_[mod(_dh,48)+1]=value;
end;
else
do;
do _i=_l_dh+1 to _dh-1;
call missing(h48_[mod(_i,48)+1]);
end;
h48_[mod(_dh,48)+1]=value;
end;
largest_1=largest(1,of h48_[*]);
largest_2=largest(2,of h48_[*]);
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.