BookmarkSubscribeRSS Feed
mahler_ji
Obsidian | Level 7

Hello All,

I have another quick question on working with time series data, specifically turning it into event-time data.  If you helped me on my last issue, then you have seen this data, but if you haven't, I have data that takes, generally, the following form.  I am also attaching an excerpt of the dataset:

Firm               Date                    Ret

A                    1/1/11                    .1

A                    1/2/11                    .01

A                    1/3/11                    .1

A                    1/4/11                    .2

A                    1/5/11                    .3

.

.

.

B                    1/1/11                    .05

B                    1/2/11                    .1

.

.

.

What I would like to do is, for each day, create a bunch of new variables that give the return for, let's say, the two days previous and two days after that day.  So, for 1/3/11, it would look something like this:

Firm                    Date                    Ret                    Ret_2               Ret_1               Ret1          Ret2

A                         1/3/11               .1                         .1                    .01                     .2              .3

and this would repeat for every day in the dataset. 

I have been trying to do this two different ways, using either the lag function, or by moving the date using the INTNX function and then doing full joins, but if I want to do more than two days surrounding the date, like 30 days on each side of the date, this is taking forever.

My problem is further exacerbated by the fact that I have only weekday data, since there is no trading data for the weekends.

Any help would be awesome!  You guys are the best!

John

3 REPLIES 3
art297
Opal | Level 21

Depends upon what data you really want. First, your data is missing more than weekdays. I don't know if the extra missing data for certain dates represent holidays or simply data that for whatever reason you don't have.

My suggested solution simply captures a range of ret values around consecutive records regardless of what the dates actually are. It also only outputs a record when you have, say, &range. days' data before and after the date:

%let range=30;

proc sort data=test1;

  by stock_symbol date;

run;

data want (keep=stock_symbol date ret_&range.-ret_1 ret ret1-ret&range.);

  set test1 (keep=stock_symbol date ret

             rename=(ret=_ret date=_date));

  by stock_symbol;

  retain counter .;

  format date date9.;

  array ret_stack {1:%eval(&range.*2+1)};

  array date_stack {1:%eval(&range.*2+1)};

  array ret_out {1:%eval(&range.*2+1)} ret_&range.-ret_1 ret ret1-ret&range.;

  retain ret_stack: date_stack:;

  if first.stock_symbol then do;

    call missing(of ret_stack{*});

    call missing(of date_stack{*});

    counter=0;

  end;

  counter+1;

  if counter le %eval(&range.*2+1) then do;

   ret_stack{counter} = _ret;

   date_stack{counter} = _date;

  end;

  else do;

    do i=1 to %eval(&range.*2);

      ret_stack{i}=ret_stack{i+1};

      date_stack{i}=date_stack{i+1};

    end;

    ret_stack{i} = _ret;

    date_stack{i} = _date;

  end;

  if counter ge %eval(&range.*2+1) then do;

    do i=1 to %eval(&range.*2+1);

      ret_out{i}=ret_stack{i};

      if i eq %eval(&range.+1) then date=date_stack{i};

    end;

    output;

  end;

run;

mkeintz
PROC Star

Are you looking for leads and lags determined by calendar days, or by trading days?  And can you show the program you are using?

You have daily trading data from CRSP, sorted by permno/date.  Getting any number of lagged values is trivial.  And getting any number of lead values isn't that difficult either.   And it shouldn't take that long to run - a single pass through the data.

I'll send along some code later, but confirm that you want trading days or calendar days.  In either case, there should be no need for the INTNX function.

Mark

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Ksharp
Super User
data have;
input Firm     $          Date  : mmddyy10.                 Ret     ;
format Date   mmddyy10. ;
cards;
A                    1/1/11                    .1
A                    1/2/11                    .01
A                    1/3/11                    .1
A                    1/4/11                    .2
A                    1/5/11                    .3
B                    1/1/11                    .05
B                    1/2/11                    .1
;
run;
proc sql;
create table want as
 select *,(select ret from have where firm=a.firm and date=a.date-2)  as prev_2 ,
          (select ret from have where firm=a.firm and date=a.date-1)  as prev_1 ,
            (select ret from have where firm=a.firm and date=a.date+1)  as next_1 ,
            (select ret from have where firm=a.firm and date=a.date+2)  as next_2 
  from have as a;
quit;

Xia Keshan

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 775 views
  • 0 likes
  • 4 in conversation