Loop event time list and calculate mean in another data set according to the event time interval.

Reply
Contributor
Posts: 24

Loop event time list and calculate mean in another data set according to the event time interval.

I have a data set with event timestamp and cusip for the stock and I want to calculate the mean volume around the timestamp using crsp.dsf file.  I am thinking of loop every row in the first data set and use the timestamp in each row to match and calculate the mean volume in crsp for that company for a fixed time interval.

 

But it seems that do loop cant work with two data sets.  Anyone have any ideas?

 

 

SAS Moderator
Posts: 62

Re: Loop event time list and calculate mean in another data set according to the event time interval. [how to improve your question]

Hello @Sasadomo,


Your question requires more details before experts can help. Can you revise your question to include more information? 

 

Review this checklist:

  • Specify a meaningful subject line for your topic.  Avoid generic subjects like "need help," "SAS query," or "urgent."
  • When appropriate, provide sample data in text or DATA step format.  See this article for one method you can use.
  • If you're encountering an error in SAS, include the SAS log or a screenshot of the error condition. Use the Photos button to include the image in your message.
    use_buttons.png
  • It also helps to include an example (table or picture) of the result that you're trying to achieve.

To edit your original message, select the "blue gear" icon at the top of the message and select Edit Message.  From there you can adjust the title and add more details to the body of the message.  Or, simply reply to this message with any additional information you can supply.

 

edit_post.png

SAS experts are eager to help -- help them by providing as much detail as you can.

 

This prewritten response was triggered for you by fellow SAS Support Communities member @draycut

.
Contributor
Posts: 24

Re: Loop event time list and calculate mean in another data set according to the event time interval

 

In the second level of the loop, I would like to perform the following loop for every event date stamp:

 

data meanvol;
do until (last.cusip); 
	set work.forms41;
	by cusip; 
	if first.cusip then call missing(total,n);
	if (secd-45) <= date <= (secd+45) then 
	do;
	total=sum(total,vol); 
	n=sum(n,1); 
	end;
end;
avgvoldsf=total/n;
run;

 

Contributor
Posts: 24

Re: Loop event time list and calculate mean in another data set according to the event time interval

where secd is the date part of the event timestamp contained in work.forms41
Trusted Advisor
Posts: 1,346

Re: Loop event time list and calculate mean in another data set according to the event time interval

Show us what the 2 datasets look like.  Does your crsp dsf file have 8-digit CUSIP? (stock specific, not company specific), or does it use  permno as an id?  What are the relevant variable names in BOTH of the data sets of interest?  Help us help you.

Contributor
Posts: 24

Re: Loop event time list and calculate mean in another data set according to the event time interval

Thank you for your reply.

 

Yes I have the 8-digit cusip in both data sets. The following is the what the data sets looks like: This is the first data set with the event timeImage_20180403151751.png

 And this is the second data set with the the daily volume. 

Image_20180403151847.png

What I attempted to do is loop every row in the first data set and for each row, match company according  to the cusip in the second data set and calculate 90 days (45 days before and 45 days after ) mean for this company around the event day.

Trusted Advisor
Posts: 1,346

Re: Loop event time list and calculate mean in another data set according to the event time interval

Your crsp.dsf dataset is almost certainly sorted by permno, not by cusip, so you either have to sort it by cusip/date (expensive given the size of the crsp.dsf dataset), or sort your form41 data set by permno.  But that dataset probably doesn't have permno, correct?  You might want to get permno's for your form41 dataset.

 

Also, before I suggest an approach, I see you have multiple events per cusip, and those event are often less the 45 days from each other.  So do you want 90 records for EACH event? 

 

Or if a sequence of events have no gaps larger the 45 days, do you want a single group of records starting 45 days before the earliest event in the sequence through 45 days after the latest event in the sequence.

 

Finally, are your CUSIPs in FORM41 the latest cusip for a given stock? Or are they merely contemporaneous with the SECD data value?  It's possible that a company has merged or split since SECD, in which case CUSIPs change, but CRSP continues the permno for the surviving company.

 

 

 

 

 

Contributor
Posts: 24

Re: Loop event time list and calculate mean in another data set according to the event time interval

- Yes crsp.dsf is currently ordered by permno and I will order it by cusip if I am using cusip.

-For accuracy, I would prefer 90 records for each event (and the event dates are pretty disperse so making 15 days groups might be unnecessary)

-they are the latest cusip form the master table. I thought of using gvkey which is unique but it shouldn't be big issue if there's change or split as I cut off the events that are not consistent of the first filling date and last filling date.

-
Contributor
Posts: 24

Re: Loop event time list and calculate mean in another data set according to the event time interval

For example, for first row in the first data set, there's a event on 2002/06/06 at 10:28:44 for company 00036110. I would like to match company 00036110 in the second data set and find find the 90 days mean for vol(in the second data set) around event date 2002/06/06. And repeat this process for every row in the first data set.

Super User
Super User
Posts: 9,604

Re: Loop event time list and calculate mean in another data set according to the event time interval

Post example test data in the form of a datastep.  You can follow the post at:

https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...

 

Then show what the output should be from that test data.  It looks like your doing a simple sum() and count() based on a group of date between two groups.  It may therefore be easier to do:

data meanvol;
  set work.forms41;
  by cusip;
retain total n; if first.cusip then call missing(total,n); if (secd-45) <= date <= (secd+45) then do; total=sum(total,vol); n=sum(n,1);
end; if last.cusip then do;
avgvoldsf=total/n;
output;
end; run;
Contributor
Posts: 24

Re: Loop event time list and calculate mean in another data set according to the event time interval

May I ask why it would be easier to use the code provided? I think they will produce the same result.

Ask a Question
Discussion stats
  • 10 replies
  • 199 views
  • 0 likes
  • 4 in conversation