Selectively Adding Observations to Large Dataset

Reply
Frequent Contributor
Posts: 101

Selectively Adding Observations to Large Dataset

Hello All,

I have data that takes the following form:


FUNDNO               CUSIP                         QTR                           SHARES                    CHANGE          

001                         5                              2001Q1                        100                              100

001                          5                            2001Q2                        101                                1

001                           5                           2001Q3                       103                                   2

001                           5                          2001Q4                        103                                  .     

001                           5                            2005Q1                           10                              10

001                           5                            2005Q2                           10                                  .

001                           5                            2005Q3                           11                                   1

001                           5                            2005Q4                           11                                   .

As you can see, the data is telling me how many shares of a certain stock a fund holds at the end of a quarter (SHARES) and the change since the last period.  You can also see that, in 2001Q1 and 2005Q1, the fund initiated its position by buying all of the shares that they owned at the end of the quarter.  That is to say, from the data, prior to 2001Q1 and from 2002Q1 to 2005Q1, this fund did not own any of this stock.

This infers that, in 2002Q1, this fund sold all 103 shares of this stock, because the next time that this stock shows up, the change in shares is equal to the number of shares held.  Meaning that all of that stock was purchased in that quarter. 

The dataset, however, does not have any entries for when the fund hold none of the stock. So there is no entry for 2002Q1, when I want there to be an entry showing that the fund owns no stock and the change was -103.  I also want to retail all the rest of the data that does not change from quarter to quarter.  I need figure out how to say the following:

If the fund owned a stock at some time in the past, and there is a time when the fund has an equal SHARES and CHANGE value, then the fund must have sold all of this stock after their last report.

I have no idea how to do this, but this is really messing up my summary data becuase I am not caputuring any of the liquidation of positions, only the initiation of positions.

Let me know if you have any insight.

John

Super User
Posts: 11,343

Re: Selectively Adding Observations to Large Dataset

Posted in reply to mahler_ji

Post an example of your expected modified data.

Super User
Posts: 5,518

Re: Selectively Adding Observations to Large Dataset

Posted in reply to mahler_ji

I think you're asking to insert data for one additional quarter, rather than filling in the complete set of quarters.  Here's one approach:

data want;

   set have;

   by fund cusip quarter;

   output;

   prior_shares = lag(shares);

   prior_quarter = lag(qtr);

   if first.cusip=0 and shares=change then do;

      shares = 0;

      change = -prior_shares;

      * logic to re-set QTR:  ;

      prior_q = input(substr(prior_quarter,6,1),1.);

      if prior_q < 4 then do;

         prior_q = prior_q + 1;

         quarter = substr(prior_quarter,1,5) || put(prior_q,1.);

      end;

      else do;

         year = 1 + input(prior_quarter,4.);

         quarter = put(year, 4.) || 'Q1';

      end;

      output;

   end;

   drop prior_quarter prior_q prior_shares year;

run;

   

The assumption here is that QTR is a character variable, but if it is actually a SAS date the logic would become easier.

You will need to sort after this step, since the added records will not be in the proper sequence.

Good luck.

Frequent Contributor
Posts: 101

Re: Selectively Adding Observations to Large Dataset

Posted in reply to mahler_ji

Hello everyone,

Thank you for your responses.  , yes I am trying to only add the one quarter.  To answer question, here is the output that I want, with the bold as the added observation:

FUNDNO               CUSIP                         QTR                           SHARES                    CHANGE         

001                         5                              2001Q1                        100                              100

001                          5                            2001Q2                        101                                1

001                           5                           2001Q3                       103                                   2

001                           5                          2001Q4                        103                                  .   

001                         5                             2002Q1                         .                                   -103

001                           5                            2005Q1                           10                              10

001                           5                            2005Q2                           10                                  .

001                           5                            2005Q3                           11                                   1

001                           5                            2005Q4                           11                                   .

I want it to show that the fund MUST have sold all shares that quarter in order to have entered a new position in 2005.

Also, the dates are in SAS format (yyq.), so I think that will just change some of the subst logic.

I am going to give this a shot and see if I can't make it work.  Just a question, what is the first.cusip step saying?  What does it mean if first.cusip = 0?

Thanks for all of your help.

John

Super User
Posts: 5,518

Re: Selectively Adding Observations to Large Dataset

Posted in reply to mahler_ji

It seems you are asking about the meaning of:

if first.cusip=0 then do;

I can give you the brief explanation, but you will need to do some research/studying.  The topic is adding a BY statement to a DATA step, and creating BY VARIABLES.

This particular statement tells SAS to ignore the first record for each FUND/CUSIP combination.  It is expected that no adjustment should be made for that record, even though there will be a break in the sequential QTR values at that point.

Ask a Question
Discussion stats
  • 4 replies
  • 368 views
  • 3 likes
  • 3 in conversation