Curious how best to go about simple time weighting. Essentially, I have quasi-daily data that I need to weight based upon when in the year it occurred relative to all other data points for that year and ID where the first observation (temporally) receives a weight of 1/n and the last observation receives a weight of n/n or 1.
So for example, if a given ID-year has the following data points (and only these data points):
DataDate Var1
Apr101980 50
Apr111980 55
Apr171980 55
Apr191980 40
Apr241980 30
Then the weights need to be as follows:
DataDate Var1 Weight
Apr101980 50 1/5
Apr111980 55 2/5
Apr171980 55 3/5
Apr191980 40 4/5
Apr241980 30 5/5
Such that the weighted sum for the given ID for the year 1980 would be = 50*1/5 + 55*2/5 + 55*3/5 + 40*4/5 + 30*5/5 = 127.
The actual data set is quite large, includes thousands of IDs, 40+ years, and probably 200+ daily observations (not always the same per year or even per ID in the same year). Any help on where to start would be appreciated.
data have;
input DataDate :$10. Var1;
n+1;/*creating counter, you could also use n=_n_*/
cards;
Apr101980 50
Apr111980 55
Apr171980 55
Apr191980 40
Apr241980 30
;
proc sql;
create table want(drop=n) as
select *,var1*n/count(*) as weight
from have;
quit;
Wow, is it really that simple? Thanks novinosrin. I'll try it out and see if it works for my larger dataset.
Please show a sample output of what you want for your input so I could modify. @PaigeMiller has pointed out my mistake .
Also, please provide me better representative sample with ID var etc
data have;
input ID DataDate : $15. Var1;
cards;
1 Apr101980 50
1 Apr111980 55
1 Apr171980 55
1 Apr191980 40
1 Apr241980 30
1 Feb81981 40
1 Feb101981 40
1 Feb181981 40
1 Feb271981 40
2 March181981 30
2 June271981 27
2 Sept81981 20
;
data temp;
set have;
year=substr(DataDate,length(DataDate)-3);
/*in your real, date should be numeric, so you can do year=year(date)*/
run;
data want;
do n=1 by 1 until(last.year);
set temp;
by id year;
end;
do _n_=1 by 1 until (last.year);
set temp;
by id year;
want=Var1*_n_/n;
output;
end;
drop n;
run;
@novinosrin wrote:
data have; input DataDate :$10. Var1; n+1;/*creating counter, you could also use n=_n_*/ cards; Apr101980 50 Apr111980 55 Apr171980 55 Apr191980 40 Apr241980 30 ; proc sql; create table want(drop=n) as select *,var1*n/count(*) as weight from have; quit;
This fails if the data set contains multiple years.
assign a sequential number by year, then weighted sum, then divide by N
Assumes data is properly sorted and each record has the value for Year (not date as in the data you provided)
data have2;
set have;
by year;
if first.year then seq=0;
seq+1;
run;
proc summary nway data=have2;
class year;
var var1/weight=seq;
output out=have3 sum=sum_var1;
run;
data want;
set have3;
final_answer=sum_var1/_freq_;
run;
PROC SUMMARY provides a weighted sum
The data step following divided by the N to turn it into an average
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.