BookmarkSubscribeRSS Feed
OzoneX15
Calcite | Level 5
Dear Sas users,

I have a difficult problem which I cannot solve.... I think to do it with an advanced merge but I still do not succeed. I have a database with transaction data. For example, here a mutual fund is trading in one stock (Isin code= BE0003565737). This is my input data in SAS. So this fund is trading (selling or buying this stock) on 19/02/2004, 19/03/2004, 24/03/2004,... and so on. But I want to know exactly how much the return is. So for example the first buy is on 19/02/2004 but the sell is on 11/08/2003. the problem is not the merging itself but the stocks it sells or buys. Because this fund is buying 3000 stocks and selling 3800 stocks.... I have to keep track the number of stocks when I do a merge (for this example, I sells 3000 stocks but it holds 800 stocks and this 800 stocks I have to merge with the following buy date... I want to come up with this table in SAS where all transactions are exactly mathcing (first in/first out). I hope I made myself clear... thanks in advance

You can have a look at the exel.. perhaps more clear...
http://users.telenet.be/webdesignsite/Problem_Stefaan_Pauwels.zip

I have tried to create two tables (exactly copies of the table beneath), so exactly two tables and to join them but still don't know how to take into account the number of stocks)

tdate fund ISIN Price transmod NumberStocks
19/02/2004 100174701 BE0003565737 47 buy 3000
19/03/2004 100174701 BE0003565737 44,8 buy 3800
24/03/2004 100174701 BE0003565737 44,59 buy 12050
23/06/2005 100174701 BE0003565737 66,65 buy 3900
11/11/2005 100174701 BE0003565737 73,15 buy 3000
14/11/2005 100174701 BE0003565737 73,2 buy 3800
17/05/2006 100174701 BE0003565737 84,3 buy 250
12/06/2006 100174701 BE0003565737 78,05 buy 800
11/08/2003 100174701 BE0003565737 87,85 sell 40000
11/08/2006 100174701 BE0003565737 87,85 sell 3800
21/09/2006 100174701 BE0003565737 84,75 buy 530
23/11/2006 100174701 BE0003565737 85,3 buy 30
4/12/2006 100174701 BE0003565737 85,95 buy 20
23/01/2007 100174701 BE0003565737 97,65 buy 260
13/02/2007 100174701 BE0003565737 97,8 sell 400
23/02/2007 100174701 BE0003565737 96,2 buy 150
6/03/2007 100174701 BE0003565737 89,67 sell 300
6/03/2007 100174701 BE0003565737 89,67 sell 40000


This is the final output table I want to come up with...

tdate fund ISIN Price transmod NumberStocks tdate_sell fund1 ISIN1 Price1 transmod1 NumberStocks1
19/02/2004 100174701 BE0003565737 47 buy 3000 11/08/2006 100174701 BE0003565737 87,85 sell 3000
19/03/2004 100174701 BE0003565737 44,8 buy 3800 11/08/2006 100174701 BE0003565737 87,85 sell 800
19/03/2004 100174701 BE0003565737 44,8 buy 3000 13/02/2007 100174701 BE0003565737 97,8 sell 400
19/03/2004 100174701 BE0003565737 44,8 buy 2600 6/03/2007 100174701 BE0003565737 89,67 sell 300
19/03/2004 100174701 BE0003565737 44,8 buy 2300 6/03/2007 100174701 BE0003565737 89,67 sell 2300
24/03/2004 100174701 BE0003565737 44,59 buy 12050 6/03/2007 100174701 BE0003565737 89,67 sell 37700
23/06/2005 100174701 BE0003565737 66,65 buy 3900 6/03/2007 100174701 BE0003565737 89,67 sell 33800

Thanks in advance,

Stefaan

Message was edited by: OzoneX15 Message was edited by: OzoneX15
9 REPLIES 9
SPR
Quartz | Level 8 SPR
Quartz | Level 8
Hello OzoneX15,

This is my understanding of your request (correct me if I am wrong):
[pre]
data i;
input tdate DDMMYY10. fund ISIN $ Price $ transmod $ NumberStocks;
format tdate date7.;
datalines;
19/02/2004 100174701 BE0003565737 47 buy 3000
19/03/2004 100174701 BE0003565737 44,8 buy 3800
24/03/2004 100174701 BE0003565737 44,59 buy 12050
23/06/2005 100174701 BE0003565737 66,65 buy 3900
11/11/2005 100174701 BE0003565737 73,15 buy 3000
14/11/2005 100174701 BE0003565737 73,2 buy 3800
17/05/2006 100174701 BE0003565737 84,3 buy 250
12/06/2006 100174701 BE0003565737 78,05 buy 800
11/08/2003 100174701 BE0003565737 87,85 sell 40000
11/08/2006 100174701 BE0003565737 87,85 sell 3800
21/09/2006 100174701 BE0003565737 84,75 buy 530
23/11/2006 100174701 BE0003565737 85,3 buy 30
4/12/2006 100174701 BE0003565737 85,95 buy 20
23/01/2007 100174701 BE0003565737 97,65 buy 260
13/02/2007 100174701 BE0003565737 97,8 sell 400
23/02/2007 100174701 BE0003565737 96,2 buy 150
6/03/2007 100174701 BE0003565737 89,67 sell 300
6/03/2007 100174701 BE0003565737 89,67 sell 40000
;
run;
proc sort data=i out=s;
by tdate fund ISIN;
run;
data t;
retain p;
set s;
if UPCASE(transmod)="SELL" and UPCASE(lag(transmod)) IN ("BUY","" ) then p+1;
if UPCASE(transmod)="BUY" and UPCASE(lag(transmod)) IN ("SELL","") then p+1;
by fund ISIN ;
run;
proc means data=t nway noprint;
output out=t1 (drop=_freq_ _type_) Sum=;
id tdate transmod;
class fund ISIN p;
var NumberStocks;
run;
data t2;
set t1;
if MOD(p,2)=0 then p1+1;
by fund ISIN;
run;
proc transpose data=t2 out=ts(drop=_name_);
id transmod;
var NumberStocks;
by fund ISIN p1;
run;
proc transpose data=t2 out=td(drop=_name_);
id transmod;
var tdate;
by fund ISIN p1;
run;
data r;
retain fund ISIN TDateSell StocksSell TDateBuy StocksBuy;
merge ts(rename=(Sell=StocksSell Buy=StocksBuy))
td(rename=(Sell=TDateSell Buy=TDateBuy));
by fund ISIN p1;
drop p1;
run;
[/pre]
Sincerely,
SPR
Ksharp
Super User
Oh My God!
It is the most complicated logic I have ever meeted.
After works for half a day.Only can get partly outcome. Hope this will help you a bit.
I will promote this code after. Actually I am very tired ,need some rest.


[pre]

data i;
input tdate : DDMMYY10. fund : $10. ISIN : $20. Price : commaX8. transmod $ numberstocks;
format tdate DDMMYY10. price commaX8.2;
datalines;
19/02/2004 100174701 BE0003565737 47 buy 3000
19/03/2004 100174701 BE0003565737 44,8 buy 3800
24/03/2004 100174701 BE0003565737 44,59 buy 12050
23/06/2005 100174701 BE0003565737 66,65 buy 3900
11/11/2005 100174701 BE0003565737 73,15 buy 3000
14/11/2005 100174701 BE0003565737 73,2 buy 3800
17/05/2006 100174701 BE0003565737 84,3 buy 250
12/06/2006 100174701 BE0003565737 78,05 buy 800
11/08/2003 100174701 BE0003565737 87,85 sell 40000
11/08/2006 100174701 BE0003565737 87,85 sell 3800
21/09/2006 100174701 BE0003565737 84,75 buy 530
23/11/2006 100174701 BE0003565737 85,3 buy 30
4/12/2006 100174701 BE0003565737 85,95 buy 20
23/01/2007 100174701 BE0003565737 97,65 buy 260
13/02/2007 100174701 BE0003565737 97,8 sell 400
23/02/2007 100174701 BE0003565737 96,2 buy 150
6/03/2007 100174701 BE0003565737 89,67 sell 300
6/03/2007 100174701 BE0003565737 89,67 sell 40000
;
run;
proc sort data=i ;
by tdate ;
run;
data i;
set i;
if _n_=1 then delete;
run;
data buy sell(rename=(tdate=_tdate fund=_fund isin=_isin price=_price transmod=_transmod numberstocks=_numberstocks));
set i;
if transmod eq 'buy' then output buy;
else if transmod eq 'sell' then output sell;
run;

data want;
b=1;s=1; sum=0;_sum=0;
set buy nobs=b_obs;set sell nobs=s_obs;


do i=1 to 100000; *10000 is whatever you want ,just to be enough bigger than number of obs.;
if numberstocks lt _numberstocks then do;
do until(_numberstocks lt sum);
set buy point=b ; b+1;
output;
sum=sum(sum,numberstocks);
end; s+1;
_numberstocks=_numberstocks - ( sum - numberstocks ); sum=0;
put _numberstocks= numberstocks= sum= b= s= b_obs= s_obs=;
if b ge b_obs then stop;
*if s ge s_obs then stop;
end;
else if numberstocks gt _numberstocks then do;
do until( numberstocks lt _sum );
set sell point=s ; s+1;
output;
_sum=sum(_sum,_numberstocks);
end; b+1;
numberstocks=numberstocks - ( _sum - _numberstocks ); _sum=0;
put _numberstocks= numberstocks= _sum= b= s= b_obs= s_obs=;
if s ge s_obs then stop;
*if b ge b_obs then stop;
end;
end;
run;

data op;
set want;
range=_numberstocks - numberstocks;
cul_sum+range;
dif=range-cul_sum;
run;


[/pre]




Ksharp
OzoneX15
Calcite | Level 5
MANY MANY thanks Ksharp and SPR for your effort!!!
I will try your codes and hopefully this will work for my dataset.

Sincelery,

Stefaan Pauwels
OzoneX15
Calcite | Level 5
KSharp, you are really close to the solution...
If you still can help me to find the whole solution.. I will try to work further on your code.. I really apreciate the effort you already made,

Kind regard,

Stefaan
Ksharp
Super User
Yes.I will give you some help.
I refined my code again.

The following code is outcome I have worked a whole day ,I think I am very tired.
I have get the result looks like yours.
But due to your partial data (not complete) and complicated logic, so this code is only reference.

If you have some other problem ,can post it.

Good Luck.


[pre]
data i;
input tdate : DDMMYY10. fund : $10. ISIN : $20. Price : commaX8. transmod $ numberstocks;
format tdate DDMMYY10. price commaX8.2;
datalines;
19/02/2004 100174701 BE0003565737 47 buy 3000
19/03/2004 100174701 BE0003565737 44,8 buy 3800
24/03/2004 100174701 BE0003565737 44,59 buy 12050
23/06/2005 100174701 BE0003565737 66,65 buy 3900
11/11/2005 100174701 BE0003565737 73,15 buy 3000
14/11/2005 100174701 BE0003565737 73,2 buy 3800
17/05/2006 100174701 BE0003565737 84,3 buy 250
12/06/2006 100174701 BE0003565737 78,05 buy 800
11/08/2003 100174701 BE0003565737 87,85 sell 40000
11/08/2006 100174701 BE0003565737 87,85 sell 3800
21/09/2006 100174701 BE0003565737 84,75 buy 530
23/11/2006 100174701 BE0003565737 85,3 buy 30
4/12/2006 100174701 BE0003565737 85,95 buy 20
23/01/2007 100174701 BE0003565737 97,65 buy 260
13/02/2007 100174701 BE0003565737 97,8 sell 400
23/02/2007 100174701 BE0003565737 96,2 buy 150
6/03/2007 100174701 BE0003565737 89,67 sell 300
6/03/2007 100174701 BE0003565737 89,67 sell 40000
;
run;
proc sort data=i ;
by tdate ;
run;
data i;
set i;
if _n_=1 then delete;
run;
data buy sell(rename=(tdate=_tdate fund=_fund isin=_isin price=_price transmod=_transmod numberstocks=_numberstocks));
set i;
if transmod eq 'buy' then output buy;
else if transmod eq 'sell' then output sell;
run;

data want;
b=1;s=1;
set buy nobs=b_obs;
set sell nobs=s_obs;
sum+numberstocks; _sum+_numberstocks;b_flag=0;s_flag=1;
output;

do while(b lt b_obs or s lt s_obs);

do while(sum lt _sum);
b+1;
set buy point=b;
sum+numberstocks;b_flag=0; s_flag=1;
output;
if b ge b_obs and s ge s_obs then stop;
end;

do while(sum gt _sum);
s+1;
set sell point=s;
_sum+_numberstocks;b_flag=1; s_flag=0;
output;
if b ge b_obs and s ge s_obs then stop;
end;

end;

run;


data op;
set want;
dif_sum=_sum - sum;
run;

data opp;
set op;
lag_dif_sum=abs(lag(dif_sum));
drop sum _sum dif_sum ;
run;

data temp;
set opp;
if _n_ eq 1 then _numberstocks = numberstocks;
else do;
if b_flag =0 and s_flag=1 then _numberstocks=lag_dif_sum;
else if b_flag=1 and s_flag=0 then numberstocks=lag_dif_sum;
end;
run;

data result;
set temp;
if b_flag=1 and s_flag=0 and numberstocks lt _numberstocks then _numberstocks =numberstocks;
drop b_flag s_flag lag_dif_sum;
run;
proc print noobs;run;
[/pre]




Ksharp
chang_y_chung_hotmail_com
Obsidian | Level 7
I don't think this is complicated at all.

All you have to do is match the number of shares bought and sold in order.
The sample data, however, are not adequate since there seems to be at least one incorrect date
and there are sells of shares that weren't bought before.
The latter makes the output invalid. GIGO!



   /* test data. as OzoneX15 posted except the date 11/08/2003 is 


      manually corrected to 11/08/2006 */


   data one;


     input tdate :ddmmyy10. fund isin :$12. price :eurox5. 


       transmod :$4. numberStocks;


   cards


   19/02/2004 100174701 BE0003565737 47    buy  3000


   19/03/2004 100174701 BE0003565737 44,8  buy  3800


   24/03/2004 100174701 BE0003565737 44,59 buy  12050


   23/06/2005 100174701 BE0003565737 66,65 buy  3900


   11/11/2005 100174701 BE0003565737 73,15 buy  3000


   14/11/2005 100174701 BE0003565737 73,2  buy  3800


   17/05/2006 100174701 BE0003565737 84,3  buy  250


   12/06/2006 100174701 BE0003565737 78,05 buy  800


   11/08/2006 100174701 BE0003565737 87,85 sell 40000


   11/08/2006 100174701 BE0003565737 87,85 sell 3800


   21/09/2006 100174701 BE0003565737 84,75 buy  530


   23/11/2006 100174701 BE0003565737 85,3  buy  30


   4/12/2006  100174701 BE0003565737 85,95 buy  20


   23/01/2007 100174701 BE0003565737 97,65 buy  260


   13/02/2007 100174701 BE0003565737 97,8  sell 400


   23/02/2007 100174701 BE0003565737 96,2  buy  150


   6/03/2007  100174701 BE0003565737 89,67 sell 300


   6/03/2007  100174701 BE0003565737 89,67 sell 40000


   ;


   run;


 


   /* assuming that the data are sorted by fund isin and tdate. 


      create buy-sell observations matching buys and sells on the


      first-in, first-out basis */


   data two;


     /* set up */ 


     length fund 8 isin $12. tid bdate bprice bshare 8


       share sdate sprice sshare 8;


     dcl hash buys(ordered:'a');


     buys.defineKey('tid');


     buys.defineData('bdate','bprice','bshare');


     buys.defineDone();


     dcl hash sells(ordered:'a');


     sells.defineKey('tid');


     sells.defineData('sdate','sprice','sshare');


     sells.defineDone();


     dcl hiter buy('buys');


     dcl hiter sell('sells'); 


 


     /* load the hashes with data */


     do until (last.isin);


       set one;


       by fund isin tdate;


       tid + 1;


       if transmod="buy" then do;


         bdate = tdate;


         bprice = price;


         bshare = numberStocks;


         buys.add();


       endelse do


         sdate = tdate;


         sprice = price;


         sshare = numberStocks;


         sells.add();


       end;


     end;


 


     /* prepare for output */


     keep fund isin bdate bprice share sdate sprice;


     format bdate sdate ddmmyy10. bprice sprice eurox5.2;


 


     /* output */


     brc = buy.first(); 


     src = sell.first();


     do while(brc = 0 and src = 0);   


       select;


         when (bshare lt sshare) do;


           temp = sshare - bshare;


           share = bshare;


           output;


           sshare = temp;


           brc = buy.next();


        end;


         when (bshare = sshare) do;


           share = bshare;


           output;


           brc = buy.next();


           src = sell.next();


         end;


         when (bshare gt sshare) do;


           temp = bshare - sshare;


           share = sshare;


           output;


           bshare = temp;


           src = sell.next();


         end


         otherwise


       end;


     end;


 


     /* just in case buy and sell volumns do not match */


     do while (brc = 0);


       call missing(sdate, sprice);


       share = bshare;


       output;


       brc = buy.next(); 


     end;


     do while (src = 0);


       call missing(bdate, bprice);


       share = sshare;


       output;


       src = sell.next();


     end;


   run;


 


   /* check */


   proc print data=two;


     var bdate bprice share sdate sprice;


   run;


   /* on lst


   Obs         bdate    bprice    share         sdate    sprice


 


     1    19/02/2004    47,00      3000    11/08/2006    87,85


     2    19/03/2004    44,80      3800    11/08/2006    87,85


     3    24/03/2004    44,59     12050    11/08/2006    87,85


     4    23/06/2005    66,65      3900    11/08/2006    87,85


     5    11/11/2005    73,15      3000    11/08/2006    87,85


     6    14/11/2005    73,20      3800    11/08/2006    87,85


     7    17/05/2006    84,30       250    11/08/2006    87,85


     8    12/06/2006    78,05       800    11/08/2006    87,85


     9    21/09/2006    84,75       530    11/08/2006    87,85


    10    23/11/2006    85,30        30    11/08/2006    87,85


    11    04/12/2006    85,95        20    11/08/2006    87,85


    12    23/01/2007    97,65       260    11/08/2006    87,85


    13    23/02/2007    96,20       150    11/08/2006    87,85


    14             .      .        8410    11/08/2006    87,85


    15             .      .        3800    11/08/2006    87,85


    16             .      .         400    13/02/2007    97,80


    17             .      .         300    06/03/2007    89,67


    18             .      .       40000    06/03/2007    89,67


   */

Ksharp
Super User
Sorry.The code above has some problem.
The following code is corrected version.
P.S. The code is recommended for the sake of your partial data.



[pre]
data i;
input tdate : DDMMYY10. fund : $10. ISIN : $20. Price : commaX8. transmod $ numberstocks;
format tdate DDMMYY10. price commaX8.2;
datalines;
19/02/2004 100174701 BE0003565737 47 buy 3000
19/03/2004 100174701 BE0003565737 44,8 buy 3800
24/03/2004 100174701 BE0003565737 44,59 buy 12050
23/06/2005 100174701 BE0003565737 66,65 buy 3900
11/11/2005 100174701 BE0003565737 73,15 buy 3000
14/11/2005 100174701 BE0003565737 73,2 buy 3800
17/05/2006 100174701 BE0003565737 84,3 buy 250
12/06/2006 100174701 BE0003565737 78,05 buy 800
11/08/2003 100174701 BE0003565737 87,85 sell 40000
11/08/2006 100174701 BE0003565737 87,85 sell 3800
21/09/2006 100174701 BE0003565737 84,75 buy 530
23/11/2006 100174701 BE0003565737 85,3 buy 30
4/12/2006 100174701 BE0003565737 85,95 buy 20
23/01/2007 100174701 BE0003565737 97,65 buy 260
13/02/2007 100174701 BE0003565737 97,8 sell 400
23/02/2007 100174701 BE0003565737 96,2 buy 150
6/03/2007 100174701 BE0003565737 89,67 sell 300
6/03/2007 100174701 BE0003565737 89,67 sell 40000
;
run;
proc sort data=i ;
by tdate ;
run;
data i;
set i;
if _n_=1 then delete;
run;
data buy sell(rename=(tdate=_tdate fund=_fund isin=_isin price=_price transmod=_transmod numberstocks=_numberstocks));
set i;
if transmod eq 'buy' then output buy;
else if transmod eq 'sell' then output sell;
run;

data want;
b=1;s=1;
set buy nobs=b_obs;
set sell nobs=s_obs;
sum+numberstocks; _sum+_numberstocks;b_flag=0;s_flag=1;
output;

do while(b lt b_obs or s lt s_obs);

do while(sum lt _sum);
b+1;
set buy point=b;
sum+numberstocks;b_flag=0; s_flag=1;
output;
if b ge b_obs and s ge s_obs then stop;
end;

do while(sum gt _sum);
s+1;
set sell point=s;
_sum+_numberstocks;b_flag=1; s_flag=0;
output;
if b ge b_obs and s ge s_obs then stop;
end;

end;

run;


data op;
set want;
dif_sum=_sum - sum;
run;

data opp;
set op;
lag_dif_sum=abs(lag(dif_sum));
drop sum _sum dif_sum ;
run;

data temp;
set opp;
if _n_ eq 1 then _numberstocks = numberstocks;
else do;
if b_flag =0 and s_flag=1 then _numberstocks=lag_dif_sum;
else if b_flag=1 and s_flag=0 then numberstocks=lag_dif_sum;
end;
run;

data result;
set temp;
if b_flag=1 and s_flag=0 and _numberstocks gt lag_dif_sum then _numberstocks =numberstocks;
drop b_flag s_flag lag_dif_sum;
run;
proc print noobs;run;

[/pre]





Ksharp

Message was edited by: Ksharp

Message was edited by: Ksharp
OzoneX15
Calcite | Level 5
KSharp,
Can I have your mail adress? Because we can't send private messages in this forum?

Kind regards

Stefaan_pauwels@hotmail.com
Ksharp
Super User
Sure.Of course. E-mail: xiakeshan@yahoo.com.cn
But I am in china now.
And maybe you need some more time to consider
for your whole data.
I can explain my code if you has some uncertainty.
Try it at your best.
So I think you can do it by yourself.


Good Luck.

[pre]
19/03/2004 100174701 BE0003565737 44,8 buy 2300 6/03/2007 100174701 BE0003565737 89,67 sell 2300
24/03/2004 100174701 BE0003565737 44,59 buy 12050 6/03/2007 100174701 BE0003565737 89,67 sell 37700
23/06/2005 100174701 BE0003565737 66,65 buy 3900 6/03/2007 100174701 BE0003565737 89,67 sell 33800
[/pre]
33800=37700-3900
not
37700-12050
Is it rigth?

Message was edited by: Ksharp

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 1368 views
  • 0 likes
  • 4 in conversation