Curious how best to go about simple time weighting. Essentially, I have quasi-daily data that I need to weight based upon when in the year it occurred relative to all other data points for that year and ID where the first observation (temporally) receives a weight of 1/n and the last observation receives a weight of n/n or 1.
So for example, if a given ID-year has the following data points (and only these data points):
DataDate Var1
Apr101980 50
Apr111980 55
Apr171980 55
Apr191980 40
Apr241980 30
Then the weights need to be as follows:
DataDate Var1 Weight
Apr101980 50 1/5
Apr111980 55 2/5
Apr171980 55 3/5
Apr191980 40 4/5
Apr241980 30 5/5
Such that the weighted sum for the given ID for the year 1980 would be = 50*1/5 + 55*2/5 + 55*3/5 + 40*4/5 + 30*5/5 = 127.
The actual data set is quite large, includes thousands of IDs, 40+ years, and probably 200+ daily observations (not always the same per year or even per ID in the same year). Any help on where to start would be appreciated.
data have;
input DataDate :$10. Var1;
n+1;/*creating counter, you could also use n=_n_*/
cards;
Apr101980 50
Apr111980 55
Apr171980 55
Apr191980 40
Apr241980 30
;
proc sql;
create table want(drop=n) as
select *,var1*n/count(*) as weight
from have;
quit;
Wow, is it really that simple? Thanks novinosrin. I'll try it out and see if it works for my larger dataset.
Please show a sample output of what you want for your input so I could modify. @PaigeMiller has pointed out my mistake .
Also, please provide me better representative sample with ID var etc
data have;
input ID DataDate : $15. Var1;
cards;
1 Apr101980 50
1 Apr111980 55
1 Apr171980 55
1 Apr191980 40
1 Apr241980 30
1 Feb81981 40
1 Feb101981 40
1 Feb181981 40
1 Feb271981 40
2 March181981 30
2 June271981 27
2 Sept81981 20
;
data temp;
set have;
year=substr(DataDate,length(DataDate)-3);
/*in your real, date should be numeric, so you can do year=year(date)*/
run;
data want;
do n=1 by 1 until(last.year);
set temp;
by id year;
end;
do _n_=1 by 1 until (last.year);
set temp;
by id year;
want=Var1*_n_/n;
output;
end;
drop n;
run;
@novinosrin wrote:
data have; input DataDate :$10. Var1; n+1;/*creating counter, you could also use n=_n_*/ cards; Apr101980 50 Apr111980 55 Apr171980 55 Apr191980 40 Apr241980 30 ; proc sql; create table want(drop=n) as select *,var1*n/count(*) as weight from have; quit;
This fails if the data set contains multiple years.
assign a sequential number by year, then weighted sum, then divide by N
Assumes data is properly sorted and each record has the value for Year (not date as in the data you provided)
data have2;
set have;
by year;
if first.year then seq=0;
seq+1;
run;
proc summary nway data=have2;
class year;
var var1/weight=seq;
output out=have3 sum=sum_var1;
run;
data want;
set have3;
final_answer=sum_var1/_freq_;
run;
PROC SUMMARY provides a weighted sum
The data step following divided by the N to turn it into an average
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.