Calcite | Level 5

## how to multiple the top 5 data by ID

Hello,

I want to multiple the top five return(rtn) by id, but I don't know how to do. I need to calculate each id's BHR=(1+ro)(1+r1)(1+r2)(1+r3)(1+r4)(1+r5)-1. The result would be 1101 BHR, 1102 BHR, and so on. Please give me some suggestions. Thank you!

 id rtn dif_day 1101 r0 0 1101 r1 1 … … … 1101 r251 251 1102 r0 0 1102 r1 1 … … 1102 r251 251 1103 r0 0 1103 r1 1 … … 1103 r251 251 …
1 ACCEPTED SOLUTION

Accepted Solutions
Rhodochrosite | Level 12

## Re: how to multiple the top 5 data by ID

We can take care of DIF_DAY and revise my old program. We just use a WHERE= condition to filter observations for DIF_DAY in

0 to 4 and do the program with the data provided.

``````data have;
input ide date yymmdd10. dif_DAY  rtn;
datalines;
1101 2009/8/26   0  -0.01729
1101 2009/8/27   1   0.001466
1101 2009/8/28   2   0.021962
1101 2009/8/31   3   0.022923
1101 2009/9/01   4   0.001401
1101 2009/9/02   5  -0.00699
1101 2009/9/03   6   0
1101 2009/9/04   7  -0.00282
1101 2009/9/07   8   0.011299
1101 2012/11/09  0  -0.00921
1101 2012/11/12  1   0
1101 2012/11/13  2  -0.01859
1101 2012/11/14  3  -0.00406
1101 2012/11/15  4  -0.01223
1101 2012/11/16  5   0.001376
1101 2012/11/19  6   0.005495
1101 2012/11/20  7   0.005464
1101 2012/11/21  8   0
1103 2014/1/23   0  -0.00643
1103 2014/1/24   1   0.006472
1103 2014/1/27   2  -0.00322
1103 2014/2/05   3  -0.00323
1103 2014/2/06   4   0.003236
1103 2014/2/07   5  -0.00323
1103 2014/2/10   6   0.003236
1103 2014/2/11   7   0.003226
;
run;

proc sort data = have;
by ide ;
run;

data want;
do until(last.ide);
set have (where = (dif_DAY between 0 and 4));
by ide;
if dif_DAY = 0 then BHR = 1;
BHR = BHR * (1 + rtn);
if dif_DAY = 4 then do; BHR = BHR - 1; output; end;
end;
drop dif_DAY rtn;
run;

proc print data = want ;
format date date10.;
run;``````
7 REPLIES 7
SAS Employee

## Re: how to multiple the top 5 data by ID

You will need to use by group processing to accomplish this. First sort the data by your id and dif_day variable then in a data step you can set the data set with a by id dif_day statement. This creates first.dif_day, first.id, last.dif_day and last.id whose values are 1 or 0 depending on the observation being processed. if the observation is the first of a particular value of the by variable the first. will be 1 otherwise it is 0. So you can use these variables to count which observation in the by group you are processing. Here is a simple example you can cut and paste:

data test;
do id=1101 to 1103;
do dif_day=0 to 20;
rtn=int(ranuni(0)*10);
output;
end;
end;
run;

proc sort data=test;
by id dif_day;
run;

data final;
set test;
by id;
array r(5) (5*0);
if first.id then do;
counter=0;
do i=1 to dim(r);
r(i)=0;
end;
end;
counter+1;
if counter le 5 then r(counter)=rtn;
if counter eq 5 then do;
total=1;
do i=1 to dim(r);
total=total * (1+r(i));
end;
total=total-1;
output;
end;
run;

Rhodochrosite | Level 12

## Re: how to multiple the top 5 data by ID

I do not see how dif_day is useful for your purpose. Here is an example to paly with.

First create a data set HAVE with r taking values in the range 0 to 1. Then sort it by ID. For every ID, use the first FIVE observations.

``````data have;
do id = 1101 to 1120;
do i = 1 to 10;
r = ranuni(123);
output;
end;
end;
drop i;
run;

proc sort data = have;
by id;
run;

data want;
do i = 1 by 1 until(last.id);
set have;
by id;
if first.id then BHR = 1;
BHR = BHR * (1 + r);
if i = 5 then do; BHR = BHR - 1; output; end;
else continue;
end;
drop i r;
run;

The OUTPUT:

id       BHR

1101     6.0477
1102     9.8697
1103     9.0726
1104     5.2849
1105     9.2560
1106    11.5767
1107     4.8835
1108     5.1041
1109     3.7164
1110     9.5704
1111    17.6294
1112     4.7647
1113    14.5187
1114     9.6550
1115     8.1160
1116     6.6383
1117     5.8073
1118     8.0927
1119     8.0730
1120     4.0700``````
Calcite | Level 5

## Re: how to multiple the top 5 data by ID

Hi Datasp,

I explained why I set dif_Day first. Take 1101(id) in my attached file as an example. This firm has two deals on different date, 2009/8/26 and 2012/11/9. And the dif_Day means the different between 2009/8/26 and 2009/8/27. However, your solution may only show 1101 in one time....Besides, I have a question about r=ranuni(123). I already have the real return(rtn), I think I should import my real dataset first?

I'm sorry that I am poor at sas...Please give me more suggestion, thank you.

I want to mutiple the rtn of dif_Day'0' to dif_Day'4' by each ide. Besides, some ide may have more than one start date and they cannot be deleted.

1101  BHR=(1+-0.01729)*(1+0.001466)*(1+0.021962)*(1+0.022923)*(1+0.001401)-1

1101  BHR=(1+-0.00921)*(1+0)*(1+-0.01859)*(1+0.-0.00406)*(1+0.-0.01223)-1

1103  BHR=

 ide date dif_DAY rtn 1101 2009/8/26 0 -0.01729 1101 2009/8/27 1 0.001466 1101 2009/8/28 2 0.021962 1101 2009/8/31 3 0.022923 1101 2009/9/1 4 0.001401 1101 2009/9/2 5 -0.00699 1101 2009/9/3 6 0 1101 2009/9/4 7 -0.00282 1101 2009/9/7 8 0.011299 … 1101 2012/11/9 0 -0.00921 1101 2012/11/12 1 0 1101 2012/11/13 2 -0.01859 1101 2012/11/14 3 -0.00406 1101 2012/11/15 4 -0.01223 1101 2012/11/16 5 0.001376 1101 2012/11/19 6 0.005495 1101 2012/11/20 7 0.005464 1101 2012/11/21 8 0 1103 2014/1/23 0 -0.00643 1103 2014/1/24 1 0.006472 1103 2014/1/27 2 -0.00322 1103 2014/2/5 3 -0.00323 1103 2014/2/6 4 0.003236 1103 2014/2/7 5 -0.00323 1103 2014/2/10 6 0.003236 1103 2014/2/11 7 0.003226

Rhodochrosite | Level 12

## Re: how to multiple the top 5 data by ID

We can take care of DIF_DAY and revise my old program. We just use a WHERE= condition to filter observations for DIF_DAY in

0 to 4 and do the program with the data provided.

``````data have;
input ide date yymmdd10. dif_DAY  rtn;
datalines;
1101 2009/8/26   0  -0.01729
1101 2009/8/27   1   0.001466
1101 2009/8/28   2   0.021962
1101 2009/8/31   3   0.022923
1101 2009/9/01   4   0.001401
1101 2009/9/02   5  -0.00699
1101 2009/9/03   6   0
1101 2009/9/04   7  -0.00282
1101 2009/9/07   8   0.011299
1101 2012/11/09  0  -0.00921
1101 2012/11/12  1   0
1101 2012/11/13  2  -0.01859
1101 2012/11/14  3  -0.00406
1101 2012/11/15  4  -0.01223
1101 2012/11/16  5   0.001376
1101 2012/11/19  6   0.005495
1101 2012/11/20  7   0.005464
1101 2012/11/21  8   0
1103 2014/1/23   0  -0.00643
1103 2014/1/24   1   0.006472
1103 2014/1/27   2  -0.00322
1103 2014/2/05   3  -0.00323
1103 2014/2/06   4   0.003236
1103 2014/2/07   5  -0.00323
1103 2014/2/10   6   0.003236
1103 2014/2/11   7   0.003226
;
run;

proc sort data = have;
by ide ;
run;

data want;
do until(last.ide);
set have (where = (dif_DAY between 0 and 4));
by ide;
if dif_DAY = 0 then BHR = 1;
BHR = BHR * (1 + rtn);
if dif_DAY = 4 then do; BHR = BHR - 1; output; end;
end;
drop dif_DAY rtn;
run;

proc print data = want ;
format date date10.;
run;``````
Calcite | Level 5

## Re: how to multiple the top 5 data by ID

Thanks for your help! It took me a long time to solve, and it worked!
Rhodochrosite | Level 12

## Re: how to multiple the top 5 data by ID

Hi ... using your posted data ...

libname zz 'z:\rtn.xls';

proc sort data=zz.'sheet1\$'n out=zztop;
by ide date;
run;

data want (keep=ide bhr date);
bhr = 1;
do i = 1 by 1 until (last.ide or i eq 5);
set zztop;
by ide;
where dif_day le 4;
bhr = bhr * (1+rtn);
end;
bhr +(- 1);
run;

libname zz clear;

first 3 observations data set WANT ...
Obs     ide         date       bhr

1    1101    01SEP2009     0.030260
2    1101    15NOV2012    -0.043421
3    1103    06FEB2014    -0.003216

Calcite | Level 5

## Re: how to multiple the top 5 data by ID

Thank you :):):)
Discussion stats
• 7 replies
• 1498 views
• 0 likes
• 4 in conversation