DATA Step, Macro, Functions and more

Finding a stock within fund holdings in previous quarter

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 15
Accepted Solution

Finding a stock within fund holdings in previous quarter

Hello all,

I would appreciate your help because I am a bit stuck in the sas programming. I have a data that involves the holdings composition of different funds (fund_id) at each quarter (stock_id , number and  prices) 

For each fund_id :

I would like to know for each stock held by a fund_id  at each quarter , if the stock existed on a previous quarter or not (i.e. if the fund held the same stock in a previous quarter). If it is the case ( i.e. if the stock existed on a previous quarter for the same fund), I should recuperate the number and price of that stock  in the previous quarter and creates for them new variables. If it's not the case  I do nothing or I put zeros for the newly created variables.

 

my variables are as follow ( what I have):

 

datefund_idstock_idnumberprice
3/31/2005A891610850045
3/31/2005A2264061062000013.73
3/31/2005A25243Q205900056.9
3/31/2005A2925051041500070.42
3/31/2005A633067103400043.305
3/31/2005A7672041001500129.75
3/31/2005A87403910070032
3/31/2005A545895490085
3/31/2005A8740391001000100
6/30/2005A25243Q2052000120
6/30/2005A292505104500150
6/30/2005A8740364540012
6/30/2005A645646300011
6/30/2005A5454545120013
3/31/2005B633067103500043.305
3/31/2005B7672041001500129.75
3/31/2005B87403910070015
3/31/2005B7777890020
3/31/2005B121212100040
3/31/2005B131313200088
6/30/2005B44456320045
6/30/2005B7672041001000036
6/30/2005B8740391003333100
6/30/2005B4666662014

 

I posted an excel sample data

 

What I want :

datefund_idstock_idnumberpricenumber1price1
3/31/2005A89161085004500
3/31/2005A2264061062000013.7300
3/31/2005A25243Q205900056.900
3/31/2005A2925051041500070.4200
3/31/2005A633067103400043.30500
3/31/2005A7672041001500129.7500
3/31/2005A8740391007003200
3/31/2005A54589549008500
3/31/2005A874039100100010000
6/30/2005A25243Q2052000120900056.9
6/30/2005A2925051045001501500070.42
6/30/2005A874036454001200
6/30/2005A64564630001100
6/30/2005A545454512001300
3/31/2005B633067103500043.30500
3/31/2005B7672041001500129.7500
3/31/2005B8740391007001500
3/31/2005B777789002000
3/31/2005B12121210004000
3/31/2005B1313132000880

0

 

 

 

Thank you in advance for your precious help.


Accepted Solutions
Solution
‎02-28-2018 06:16 AM
Trusted Advisor
Posts: 1,346

Re: Finding a stock within fund holdings in previous quarter

[ Edited ]

Your data is sorted by date/fundid/stockid (edited change: it's sorted by date/fundid, but not stockid within date/fundid.  However that is not required for my suggested program to work.  So don't think you need to sort the data any further than its current status).

 

If you have a large dataset of holdings, I would NOT sort by fundid/stock/date  (only to re-sort back to original order).  Instead you can use a hash object to dynamically hold the most recent date, price, and number of shares for each fund/stock combination.  This code is untested:

 

data want (drop=_:);
  if _n_=1 then do;
    if 0 then set have 
                  have (rename=(date=_date1 number=number1 price=price1));
    declare hash h ();
      h.definekey('fund_id','stock_id');
      h.definedata('_date1','number1','price1');
      h.definedone();
  end;

  retain _last_qtr_date .;

  set have;
  by date ;
  if first.date then _last_qtr_date=intnx('qtr',date,-1,'end');

  _rc=h.find();
  if NOT(_rc=0 and _date1=_last_qtr_date) then do;
    number1=0;
    price1=.;
  end;

_rc=h.replace(key:fund_id,key:stock_id,data:date,data:number,data:price); run;

 

Edited at 2:56PM,  just now put in the essential  _rc=h.replace ....    statement.

 

Edited at 4:45PM.  I think you will find this much faster than the PROC SQL, because the latter requires a Cartesian comparison of all dates within each fund/stock group.  The advantage of PROC SQL is that it doesn't require pre-sorted data sets.  But that also means it often will not take advantage of data that are pre-sorted.

 

View solution in original post


All Replies
Super User
Posts: 23,778

Re: Finding a stock within fund holdings in previous quarter

1. Sort by STOCK, date

2. Use LAG to retreive the previous value and/or date.

3. Check if the date is previous quarter, if so, flag as in previous quarter

4. Otherwise, set flag to 0. 

 

Spoiler

bera00 wrote:

Hello all,

I would appreciate your help because I am a bit stuck in the sas programming. I have a data that involves the holdings composition of different funds (fund_id) at each quarter (stock_id , number and  prices) 

For each fund_id :

I would like to know for each stock held by a fund_id  at each quarter , if the stock existed on a previous quarter or not (i.e. if the fund held the same stock in a previous quarter). If it is the case ( i.e. if the stock existed on a previous quarter for the same fund), I should recuperate the number and price of that stock  in the previous quarter and creates for them new variables. If it's not the case  I do nothing or I put zeros for the newly created variables.

 

my variables are as follow ( what I have):

 

date fund_id stock_id number price
3/31/2005 A 8916108 500 45
3/31/2005 A 226406106 20000 13.73
3/31/2005 A 25243Q205 9000 56.9
3/31/2005 A 292505104 15000 70.42
3/31/2005 A 633067103 4000 43.305
3/31/2005 A 767204100 1500 129.75
3/31/2005 A 874039100 700 32
3/31/2005 A 5458954 900 85
3/31/2005 A 874039100 1000 100
6/30/2005 A 25243Q205 2000 120
6/30/2005 A 292505104 500 150
6/30/2005 A 87403645 400 12
6/30/2005 A 645646 3000 11
6/30/2005 A 5454545 1200 13
3/31/2005 B 633067103 5000 43.305
3/31/2005 B 767204100 1500 129.75
3/31/2005 B 874039100 700 15
3/31/2005 B 77778 900 20
3/31/2005 B 121212 1000 40
3/31/2005 B 131313 2000 88
6/30/2005 B 44456 3200 45
6/30/2005 B 767204100 10000 36
6/30/2005 B 874039100 3333 100
6/30/2005 B 466666 20 14

 

I posted an excel sample data

 

What I want :

date fund_id stock_id number price number1 price1
3/31/2005 A 8916108 500 45 0 0
3/31/2005 A 226406106 20000 13.73 0 0
3/31/2005 A 25243Q205 9000 56.9 0 0
3/31/2005 A 292505104 15000 70.42 0 0
3/31/2005 A 633067103 4000 43.305 0 0
3/31/2005 A 767204100 1500 129.75 0 0
3/31/2005 A 874039100 700 32 0 0
3/31/2005 A 5458954 900 85 0 0
3/31/2005 A 874039100 1000 100 0 0
6/30/2005 A 25243Q205 2000 120 9000 56.9
6/30/2005 A 292505104 500 150 15000 70.42
6/30/2005 A 87403645 400 12 0 0
6/30/2005 A 645646 3000 11 0 0
6/30/2005 A 5454545 1200 13 0 0
3/31/2005 B 633067103 5000 43.305 0 0
3/31/2005 B 767204100 1500 129.75 0 0
3/31/2005 B 874039100 700 15 0 0
3/31/2005 B 77778 900 20 0 0
3/31/2005 B 121212 1000 40 0 0
3/31/2005 B 131313 2000 88 0

0

 

 

 

Thank you in advance for your precious help.


 

SAS Super FREQ
Posts: 825

Re: Finding a stock within fund holdings in previous quarter

You can use Proc SQL and a self referencing join, where you compute the date of the previous quarter using the INTNX function.

Based on the information returned, you can fill the new variables with values from the previous quarter.

 

See here an example:

data have;
  infile cards dlm=",";
  seqNr + 1;
  input
    date : mmddyy10.
    fund_id	: $8.
    stock_id : $16.
    number : 8.
    price :8.
  ;
  format date date9.;
cards;
3/31/2005,A,8916108,500,45
3/31/2005,A,226406106,20000,13.73
3/31/2005,A,25243Q205,9000,56.9
3/31/2005,A,292505104,15000,70.42
3/31/2005,A,633067103,4000,43.305
3/31/2005,A,767204100,1500,129.75
3/31/2005,A,874039100,700,32
3/31/2005,A,5458954,900,85
3/31/2005,A,874039100,1000,100
6/30/2005,A,25243Q205,2000,120
6/30/2005,A,292505104,500,150
6/30/2005,A,87403645,400,12
6/30/2005,A,645646,3000,11
6/30/2005,A,5454545,1200,13
3/31/2005,B,633067103,5000,43.305
3/31/2005,B,767204100,1500,129.75
3/31/2005,B,874039100,700,15
3/31/2005,B,77778,900,20
3/31/2005,B,121212,1000,40
3/31/2005,B,131313,2000,88
6/30/2005,B,44456,3200,45
6/30/2005,B,767204100,10000,36
6/30/2005,B,874039100,3333,100
6/30/2005,B,466666,20,14
;

proc sql feedback;
  select
    A.seqNr
    , A.date
    , A.fund_id
    , A.stock_id
    , A.number
    , A.price
    , B.seqNr
    , B.date
    , B.fund_id
    , B.stock_id
    , B.number
    , B.price
    , case
        when b.stock_id not is missing then b.number else 0
      end as previousNumner
    , case
        when b.stock_id not is missing then b.price else 0
      end as previousprice

  from
    have as a
    left outer join
    have as b
    on
      a.fund_id = b.fund_ID
      and a.stock_id = b.stock_id
      and b.date = intnx("qtr", a.date, -1, "E")
  order by
    a.date
    , a.fund_id
    , a.stock_id
  ;
quit;
    
Occasional Contributor
Posts: 15

Re: Finding a stock within fund holdings in previous quarter

Posted in reply to Bruno_SAS

Thank you for your resonse. this method works

Solution
‎02-28-2018 06:16 AM
Trusted Advisor
Posts: 1,346

Re: Finding a stock within fund holdings in previous quarter

[ Edited ]

Your data is sorted by date/fundid/stockid (edited change: it's sorted by date/fundid, but not stockid within date/fundid.  However that is not required for my suggested program to work.  So don't think you need to sort the data any further than its current status).

 

If you have a large dataset of holdings, I would NOT sort by fundid/stock/date  (only to re-sort back to original order).  Instead you can use a hash object to dynamically hold the most recent date, price, and number of shares for each fund/stock combination.  This code is untested:

 

data want (drop=_:);
  if _n_=1 then do;
    if 0 then set have 
                  have (rename=(date=_date1 number=number1 price=price1));
    declare hash h ();
      h.definekey('fund_id','stock_id');
      h.definedata('_date1','number1','price1');
      h.definedone();
  end;

  retain _last_qtr_date .;

  set have;
  by date ;
  if first.date then _last_qtr_date=intnx('qtr',date,-1,'end');

  _rc=h.find();
  if NOT(_rc=0 and _date1=_last_qtr_date) then do;
    number1=0;
    price1=.;
  end;

_rc=h.replace(key:fund_id,key:stock_id,data:date,data:number,data:price); run;

 

Edited at 2:56PM,  just now put in the essential  _rc=h.replace ....    statement.

 

Edited at 4:45PM.  I think you will find this much faster than the PROC SQL, because the latter requires a Cartesian comparison of all dates within each fund/stock group.  The advantage of PROC SQL is that it doesn't require pre-sorted data sets.  But that also means it often will not take advantage of data that are pre-sorted.

 

Occasional Contributor
Posts: 15

Re: Finding a stock within fund holdings in previous quarter

Thank you so much

this method works for my data

PROC Star
Posts: 1,845

Re: Finding a stock within fund holdings in previous quarter

data have;
infile cards dlm=",";
input date : mmddyy10. fund_id	: $8. stock_id : $16.  number : 8.  price :8.;
format date date9.;
cards;
3/31/2005,A,8916108,500,45
3/31/2005,A,226406106,20000,13.73
3/31/2005,A,25243Q205,9000,56.9
3/31/2005,A,292505104,15000,70.42
3/31/2005,A,633067103,4000,43.305
3/31/2005,A,767204100,1500,129.75
3/31/2005,A,874039100,700,32
3/31/2005,A,5458954,900,85
3/31/2005,A,874039100,1000,100
6/30/2005,A,25243Q205,2000,120
6/30/2005,A,292505104,500,150
6/30/2005,A,87403645,400,12
6/30/2005,A,645646,3000,11
6/30/2005,A,5454545,1200,13
3/31/2005,B,633067103,5000,43.305
3/31/2005,B,767204100,1500,129.75
3/31/2005,B,874039100,700,15
3/31/2005,B,77778,900,20
3/31/2005,B,121212,1000,40
3/31/2005,B,131313,2000,88
6/30/2005,B,44456,3200,45
6/30/2005,B,767204100,10000,36
6/30/2005,B,874039100,3333,100
6/30/2005,B,466666,20,14
;

proc sort data=have out=_have;
by fund_id stock_id date;
run;

data temp;
drop _:;
do until(last.stock_id);
set _have;
by fund_id stock_id date;
if first.stock_id then do;
number1=0;
price1=0;
end;
else if intnx('qtr',date,-1,'end')=_date1 then  do;
number1=_number1;
price1=_price1;
end;
_number1=number;
_price1=price;
_date1=date;
end;
run;

data want;
merge _have temp;
by fund_id stock_id date;
if missing(number1) and missing(price1) then do;
number1=0;
price1=0;
end;
run;

proc sort data=want out=final_want;
by fund_id date;
run;
Trusted Advisor
Posts: 1,346

Re: Finding a stock within fund holdings in previous quarter

Posted in reply to novinosrin

Most mutual fund financial research projects I have seen cover 1,000's of funds, with 1,000's of stocks over 10-year plus time periods (40 quarters).  A  solution like this with multiple sorts is expensive in time, disk space, and disk input/output activity.  It has four steps which read the full data set 4 times, writes a full data set 3 times, and invokes a proc sort 2 times. 

 

Given that the original sort order of the data set can be utilized as is, I find it hard to imagine a situation in which this solution should be suggested for a production environment.

PROC Star
Posts: 1,845

Re: Finding a stock within fund holdings in previous quarter

[ Edited ]
data have;
infile cards dlm=",";
input date : mmddyy10. fund_id	: $8. stock_id : $16.  number : 8.  price :8.;
format date date9.;
cards;
3/31/2005,A,8916108,500,45
3/31/2005,A,226406106,20000,13.73
3/31/2005,A,25243Q205,9000,56.9
3/31/2005,A,292505104,15000,70.42
3/31/2005,A,633067103,4000,43.305
3/31/2005,A,767204100,1500,129.75
3/31/2005,A,874039100,700,32
3/31/2005,A,5458954,900,85
3/31/2005,A,874039100,1000,100
6/30/2005,A,25243Q205,2000,120
6/30/2005,A,292505104,500,150
6/30/2005,A,87403645,400,12
6/30/2005,A,645646,3000,11
6/30/2005,A,5454545,1200,13
3/31/2005,B,633067103,5000,43.305
3/31/2005,B,767204100,1500,129.75
3/31/2005,B,874039100,700,15
3/31/2005,B,77778,900,20
3/31/2005,B,121212,1000,40
3/31/2005,B,131313,2000,88
6/30/2005,B,44456,3200,45
6/30/2005,B,767204100,10000,36
6/30/2005,B,874039100,3333,100
6/30/2005,B,466666,20,14
;


data want;
_n=0;
do n=1 by 1 until(last.fund_id);
set have;
by fund_id date;
number1=0;price1=0;
if first.date then _n+1;
array _t(25) $20;/*arbitrary subscript value*/
array _k(25,3);
if n=1 then call missing(of _t(*), of _k(*));
if _n>1 then do;
_f=whichc(stock_id,of _t(*));
	if _f>0 then do;
	if intnx('qtr',date,-1,'end')=_k(_f,3) then do;
	number1=_k(_f,1);
	price1=_k(_f,2);
	end;
end;
end;
_t(n)=stock_id;
_k(n,1)=number;
_k(n,2)=price;
_k(n,3)=date;
output;
end;
drop _: n; 
run;
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 264 views
  • 7 likes
  • 5 in conversation