BookmarkSubscribeRSS Feed
Inquisitive101
Fluorite | Level 6

outcome.png

Hi 

 

I'm trying to execute the below code to essentially compund 3 month return ..essentially rolling 3 month comp return for each ID but found a peculiar issue while calculating using lag3 function as you may note below. I've copied the output and also shown what the expected outcome should be for comp3_ret.  Could you kindly let me know where I'm missing out:

 

 data pras2;
input DAT1 mmddyy11. ID $ RET VAR1 VAR2 FLAG ;
cards;
12/31/2010 A 0.20 25 100 1
11/30/2010 A 0.20 15 100 1
01/31/2011 A 0.20 35 100 1
08/31/2010 A 0.20 35 100 1
10/31/2010 A 0.20 20 100 1
09/30/2010 A 0.20 35 100 1
12/31/2010 AA 0.25 25 100 1
11/30/2010 AA 0.25 15 100 1
01/31/2011 AA 0.25 35 100 1
08/31/2010 AA 0.25 35 100 1
10/31/2010 AA 0.25 20 100 1
09/30/2010 AA 0.25 35 100 1
;
run;

PROC SORT DATA = PRAS2;
BY ID DAT1;

data pras1;
set pras2;
by ID DAT1;
retain comp;
if first.id then comp = 1;
comp = comp*(1+ret);

/**** Trailing 3 month compounded return ****/
if (first.id eq 0) and (ID eq lag2(ID)) then comp3_ret = comp - 1 ;
if (first.ID eq 0) and (ID eq lag3(ID)) then comp3_ret = (comp/lag3(comp))-1 ;
run;

 

PROC PRINT DATA = PRAS1;
format DAT1 MMDDYY10. ;
RUN;

 

output : (see attached img)

 

 


outcome.png
5 REPLIES 5
Reeza
Super User

Create a counter that counts the records per ID. If counter is greater than 3 then do calculations. 

 

Also calculate LAG outside of the IF statement. 

LAG doesn't work the way you think it should, is the quick answer. 

 

Longer explanation 

http://support.sas.com/resources/papers/proceedings09/055-2009.pdf

 

Ksharp
Super User
Not sure what you are looking for.


 data pras2;
input DAT1 mmddyy11. ID $ RET VAR1 VAR2 FLAG ;
cards;
12/31/2010 A 0.20 25 100 1
11/30/2010 A 0.20 15 100 1
01/31/2011 A 0.20 35 100 1
08/31/2010 A 0.20 35 100 1
10/31/2010 A 0.20 20 100 1
09/30/2010 A 0.20 35 100 1
12/31/2010 AA 0.25 25 100 1
11/30/2010 AA 0.25 15 100 1
01/31/2011 AA 0.25 35 100 1
08/31/2010 AA 0.25 35 100 1
10/31/2010 AA 0.25 20 100 1
09/30/2010 AA 0.25 35 100 1
;
run;

PROC SORT DATA = PRAS2; 
BY ID DAT1;
data pras1;
set pras2;
by ID DAT1;
retain comp found comp3_ret;
if first.id then do;comp = 1; found=0;comp3_ret=.;end;
comp = comp*(1+ret);

/**** Trailing 3 month compounded return ****/
if (first.id eq 0) and (ID eq lag2(ID)) and not found then do;
 comp3_ret = comp - 1 ;found=1;
end;
drop found;
run;


Astounding
PROC Star

Trying to change as little as possible in the existing program, the DATA step would become:

 

data pras1;
set pras2;
by ID DAT1;

id_back2 = lag2(id);

id_back3 = lag3(id);


retain comp;
if first.id then comp = 1;
comp = comp*(1+ret);

comp_back3 = lag3(comp);

/**** Trailing 3 month compounded return ****/
if  (ID eq id_back2) then comp3_ret = comp - 1 ;
if (ID eq id_back3) then comp3_ret = (comp/comp_back3)-1 ;
run;

 

You don't need to check for first.ID=0.  If ID is equal to ID from a couple of records previously, that can only happen when first.ID=0.

Inquisitive101
Fluorite | Level 6

thanks; but the objective is to replicate this for long duration comp. returns like say 6, 9 & 12 month returns. So I was hesitant to declare that many lag variabels upfront using arrays, 

 

Meanwhile, got this sorted out using proc expand from another post

"convert ret=comp_ret_3 / transformout=(+1 log reverse movsum 3 exp -1 );  "

 

Thanks

 

 

 

mkeintz
PROC Star

PrOC EXPAND is definitely the way to go. But for those without SAS/ETS (and therefore without proc expand), here is a data step solution, with by groups, to get rolling returns of any size (9 in this case) without recourse to multiple lag functions.  Just a lag for the window size (lag9 here):

 

Notes:

  1.   A rolling cum return (NEED) is generated by multiplying by the current rolling return and dividing by the lag9 return (as long as the lag9 record is for the same id).
  2. The IFN function avoids the problem of having a lag function executed in an IF statement.  IFN evaluates both outcomes (arguments 2 and 3 of the function) regardless of the evaluation of the first argument.  This means the lag queue (argument 2 of IFN) is updated whether or not its results are used.  In other words, it  avoids the tedious lagging and conditional use of the lagged value.  I.e. it avoids this: 
       ret9=lag9(ret);
       if lag9(id)=id then divisor=1+ret9;
       else divisor=1;
       need=   (need+1)*(1+ret)/divisor - 1;

 

 

data want (drop=need);

  set have;

  by id;

  retain need ;

  if first.id then need=0;

  need =  (1+need)*(1+ret) / (1+ifn(lag9(id)=id,lag9(ret),0))  - 1;

  if lag9(id)=id then rolling_return9=need;

run;

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1669 views
  • 0 likes
  • 5 in conversation