BookmarkSubscribeRSS Feed
Agent1592
Pyrite | Level 9

Dear SAS Community:

I have a list of people making predictions about the value of a product and I would like to compute the average and median prediction at the end of each month for all the predictions issued for the last 90 days. Also if there are multiple predictions from 1 person, I just want to keep the last one.

data have;
  infile datalines dsd truncover;
  input ProductID:$8. Date:YYMMDDN8. PersonID:Z6. Prediction:32.;
  format Date YYMMDDN8. PersonID Z6.;
datalines4;
87482X10,20140309,119962,16
87482X10,20140310,079092,16
87482X10,20140310,080474,16
87482X10,20140310,071182,16
87482X10,20140407,079092,16.5
87482X10,20140807,119962,17
87482X10,20140808,079092,17
87482X10,20141105,080474,17
87482X10,20141203,072481,16
87482X10,20150407,079092,17.5
87482X10,20150430,072481,17
87482X10,20150430,119962,18
87482X10,20150430,080474,18
87482X10,20150430,071182,18
87482X10,20150701,079092,20
87482X10,20150730,072481,18
87482X10,20150730,080474,20
87482X10,20150730,071182,19
87482X10,20150826,080474,19
87482X10,20150930,119962,17.5
87482X10,20151002,072481,17.5
87482X10,20151006,079092,19
87482X10,20151029,119962,18.5
87482X10,20151029,080474,20
87482X10,20151030,072481,18
87482X10,20151030,079092,19.5
87482X10,20151204,080474,21
87482X10,20151228,119962,19.5
87482X10,20151229,072481,19
87482X10,20151229,071182,21
87482X10,20160106,079092,21
87482X10,20160126,071182,16
87482X10,20160127,119962,20
87482X10,20160302,080474,21.5
87482X10,20160428,119962,22
87482X10,20160428,080474,22.5
87482X10,20160727,119962,23
87482X10,20160728,080474,24
;;;;

I know I can use procedure means but with this procedure but am not sure how to compute the forecast for the last 90 days at the end of the month. 

proc sort data=have; 
        by productid date value estimid prediction personID;
run;

data have1; set have;
by productid date value estimid prediction personID;
if last.personID;
run;

proc means data=have2 noprint;
        by productid  date;
        var /*prediction */ new_prediction;                         
        output out= average (drop=_type_ _freq_)   
        mean=average n=numpredictions;
run;

Thanks!

14 REPLIES 14
PeterClemmensen
Tourmaline | Level 20

If I understand you correctly, you want to calculate statistics on prediction data 90 days back for each "end of month" date that is present in your data? Am I correct?

Agent1592
Pyrite | Level 9

Yes that is correct. I would like to have monthly predictions which would mean predictions at the end of the month.

Reeza
Super User

I know I can use procedure means but with this procedure but am not sure how to compute the forecast for the last 90 days at the end of the month. 

 

Use a WHERE statement and run PROC MEANS twice, once to get the means of each month and then once for the 90 days. 

Or create your own custom format for 90 day periods and use the multi-label format option available in PROC MEANS. If it was a fixed quarter you could use the Quarter format for PROC MEANS. 

Agent1592
Pyrite | Level 9

Thanks could you please let me know the code with the where statement.

Here is the idea. For each product productid I would like to compute the average prediction for the different personID. So the average is not across time but across personIDs. Each person may have different forecasts over the last 90 days but I take the most recent forecast. Then using calendar month end dates I calculate the average over personIDs.

productidYYMMDDN8personIDprediction
11/1/180110
11/14/18029
11/15/180111
11/20/18028
11/22/18035
Average  8=average(11,8,5)
Reeza
Super User

@Agent1592 wrote:

Thanks could you please let me know the code with the where statement.

Here is the idea. For each product productid I would like to compute the average prediction for the different personID. So the average is not across time but across personIDs. Each person may have different forecasts over the last 90 days but I take the most recent forecast. Then using calendar month end dates I calculate the average over personIDs.

productid YYMMDDN8 personID prediction
1 1/1/18 01 10
1 1/14/18 02 9
1 1/15/18 01 11
1 1/20/18 02 8
1 1/22/18 03 5
Average     8=average(11,8,5)

How are you defining 90 days? From the last day in the data, from a specific time start? From the last three months?

Agent1592
Pyrite | Level 9

I want the average prediction as of lets say 1/31/2018 so I would take the last forecast if it is not older than 90 days before 1/31/2018 (i.e. before the end of the month).

Agent1592
Pyrite | Level 9
From the last 3 months. I want the average prediction as of lets say 1/31/2018 so I would take the last forecast if it is not older than 90 days before 1/31/2018 (i.e. before the end of the month).
MarkWik
Quartz | Level 8

@Agent1592 Show one proper complete example  of data and your computed value for one product defining how you computed the 90 days period. So far it's not clear.

Agent1592
Pyrite | Level 9
productiddatepersonIDpredictionEnd of MonthAverage
87482X102014030911996216  
87482X10201403107909216  
87482X10201403108047416  
87482X102014031071182163/31/201416
87482X10201404077909216.54/30/201416.13
    5/31/201416.13
    6/30/201416.13

The idea is to compute the 90 day average prediction  before the end of the month for each product id. S0 if we want to calculate the average for 04/30/2014 we take all the most recent predictions for each analyst as from 01/30/2014 to 04/30/2014. The average is 16.13. You see personID=79092 gave 2 predictions within this time but we take his most recent prediction of 16.5.

 

novinosrin
Tourmaline | Level 20

Hi @Agent1592 I'm afraid if you could get your focus into fine attention to details in your requirement, this problem is not that challenging subject to you giving us "well detailed requirement with neat with good representative samples of HAVE and WANT like @MarkWik suggested. 

Anyway, for me SAS coding is very therapeutic, and here is my solution. However, we may have to work interactively back and forth meaning if you can be responsive enough to help me understand your req to test effectively, I'm willing to help you with the changes that my current approach may require. 

data have;
input productid : $15.	date :yymmdd10.	personID : $15.	prediction;
format date yymmdd10.;
datalines;
87482X10	20140309	119962	16
87482X10	20140310	79092	16
87482X10	20140310	80474	16
87482X10	20140310	71182	16
87482X10	20140407	79092	16.5
;

/*Before the following step, you MUST filter with last.personid after sorting by ascending date to make sure you only take the last prediciton of personid */

data temp;
set have;
by productid;
k=lag(date);
if first.productid then _grp=1;
else if intck('month',k,date)>=1 then _grp+1;
drop k;
run;
data want;
if _n_=1 then do;
if 0 then set temp(rename=(date=_date prediction=_prediction));
dcl hash h(dataset: 'temp(rename=(date=_date prediction=_prediction))', ordered: 'a',multidata:'y');
  h.definekey('productid');
  h.definedata('_date','_prediction');
  h.definedone();
  end;
  array _temp(1000);
  call missing(of _temp(*));
do until(last.productid);
	set temp;
	by productid _grp;
	if last._grp then 
	do;
		_n=0;
		do while(h.do_over(key:productid) eq 0);
  			if intnx('week',date,-90)<=_date<=date then 
			do;
				_n+1;
				_temp(_n)=_prediction;
			end;
		end;
		_sum=sum(of _temp(*));
		average=divide(_sum,_n);
	end;
	output;
end;
drop  _:;
run;
Agent1592
Pyrite | Level 9

@novinosrin Thanks for your reply. Basically I am trying to create a monthly moving average at the end of the month for all the predictions made in the last 90 days before the end of the month. 

 

data want1; set want;
  format date date9.;
  MEND = intnx('month',date,1)-1;
  format MEND date9.;
  MEND_90=intnx('day',MEND,-90);
  format MEND_90 date9.;
run;

So the moving average will include all predictions but if there are multiple predictions I just need to take the last one (last.personID)

 

0<date<=MEND_90).

Is it possible to do this easier with a proc expand in SAS?

 

novinosrin
Tourmaline | Level 20

I have never really used SAS ETS module and its procs. Maybe you should post the question SAS forecasting and econometric board rather than programming board. An ETS expert might just help you out

It is quite easy to program however rather hard to maintain unless the person who asks help here is also at the same level to comprehend the programs.

Reeza
Super User

PROC EXPAND can't handle the multiple observations, you can first clean that up with TIMESERIES or a data step and then use PROC EXPAND. 

Reeza
Super User

https://communities.sas.com/t5/Base-SAS-Programming/standard-deviation-at-the-end-of-each-month-base...

 

This is 12 months and STD but you can modify it for 3 months or 90 days (they are different) and the Mean instead.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 14 replies
  • 3347 views
  • 0 likes
  • 5 in conversation