BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
KS99
Obsidian | Level 7

Hi, all, 

Greetings, and nice to meet you again. 

 

I am trying to merge two datasets. One is a very large mega data of US daily stock returns. The other is a much smaller data containing only important event days for firms. 

What I am testing is how a firm's single-day events impact her post-event stock returns on a daily basis. So, by this merging, I am attaching a dummy variable indicating event days to the mega data of daily stock returns. 

 

I have two following datasets: 

(1) US daily stock returns

CUSIPDATE
000021308/17/1990
000021308/20/1990
000021308/21/1990
000021308/22/1990
000021308/23/1990
000021308/24/1990
000021308/27/1990
….........
000209103/1/1988
000209103/2/1988
000209103/5/1988
000209103/6/1988
….........
000209104/3/1989
000209104/4/1989
000209104/5/1989
000209104/6/1989
….........

 

(2) important event days

CUSIPdateEvent
000021308/23/19901
000021304/17/19921
000021307/18/19921
000021306/22/19941
000021309/21/19941
000021306/20/19951
0000213012/19/19961
000209103/3/19881
000209101/17/19891
000209104/5/19891
000209107/11/19901
0002091010/11/19951
….........

 

(3) And, what I want to obtain is as follows: 

CUSIPDATEEvent
000021308/17/1990 
000021308/20/1990 
000021308/21/1990 
000021308/22/1990 
000021308/23/19901
000021308/24/1990 
000021308/27/1990 
….........
000209103/1/1988 
000209103/2/1988 
000209103/5/19881
000209103/6/1988 
….........
000209104/3/1989 
000209104/4/1989 
000209104/5/19891
000209104/6/1989 
….........

 

From the two base datasets (1) and (2), most event days correspond to daily stock returns. 

However, there are cases in which, like Cusip 00020910's event on 3/3/1988, the event day does not have its corresponding day of stock returns. 

In this case, I want to attach the dummy's value = 1 to the very next observation from the daily stock returns, that is, 3/5/1988. Once again, since I am testing the post-event impact on stock returns, the event dummy should be attached to the NEXT closest observation (i.e., stock returns). 

 

I looked up SAS community posts to figure out how to do this. The best I could find was to use proc sql, by applying On a.cusip=b.cusip where b.date <= a.date group by a.cusip
having abs(a.date-b.date)=min(abs(a.date-b.date)) order by a.cusip, b.date; 

However, it didn't work. Please help! 

 

Sincerely, 

KS -, 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

A very small correction:

 

data want(drop = c d rc);
   
   if _N_ = 1 then do;
      dcl hash h(dataset : 'have2(rename = (cusip = c date = d)', ordered : 'Y');
      h.definekey('c', 'd');
	  h.definedone();
	  dcl hiter i('h');
   end;

   set have1;
   c = '        ';
   d = .;

   if h.check(key : cusip, key : date) = 0 then do;
      pd = date;
	  h.remove(key : cusip, key : date);
   end;

   else do;
      h.add(key : cusip, key : date, data : cusip, data : date);

      if i.setcur(key : cusip, key : date) = 0 then
         if i.prev() = 0 & cusip = c & date - d <= 5 then do;
            pd = d;
			rc = i.last();
			rc = i.next();
            h.remove(key : cusip, key : pd);
      end;

      h.remove(key : cusip, key : date);
   end;

   event = pd > .;

   format pd mmddyy10.;

run;

 

Result:

 

CUSIP     DATE        pd          event 
00002130  08/17/1990  .           0 
00002130  08/20/1990  .           0 
00002130  08/21/1990  .           0 
00002130  08/22/1990  .           0 
00002130  08/23/1990  08/23/1990  1 
00002130  08/24/1990  .           0 
00002130  08/27/1990  .           0 
00020910  03/01/1988  .           0 
00020910  03/02/1988  .           0 
00020910  03/05/1988  03/03/1988  1 
00020910  03/06/1988  .           0 
00020910  04/03/1989  .           0 
00020910  04/04/1989  .           0 
00020910  04/05/1989  04/05/1989  1 
00020910  04/06/1989  .           0 

 

 

View solution in original post

8 REPLIES 8
KS99
Obsidian | Level 7

It is daily stock returns of NYSE and other markets over 1986 to 2020. 

Has more than 40,000,000 obs. So, I divided them into four subsets to do the task in SAS. 

 

Thank you! 

PeterClemmensen
Tourmaline | Level 20

Do not divide your data into smaller portions.

 

Try something like this. Should scale well. 

 

Feel free to ask  🙂

 

data have1;
input CUSIP $ DATE : mmddyy10.;
format DATE mmddyy10.;
datalines;
00002130 8/17/1990 
00002130 8/20/1990 
00002130 8/21/1990 
00002130 8/22/1990 
00002130 8/23/1990 
00002130 8/24/1990 
00002130 8/27/1990 
00020910 3/1/1988  
00020910 3/2/1988  
00020910 3/5/1988  
00020910 3/6/1988  
00020910 4/3/1989  
00020910 4/4/1989  
00020910 4/5/1989  
00020910 4/6/1989  
;

data have2;
input CUSIP $ date :mmddyy10. Event;
format DATE mmddyy10.;
datalines;
00002130 8/23/1990  1 
00002130 4/17/1992  1 
00002130 7/18/1992  1 
00002130 6/22/1994  1 
00002130 9/21/1994  1 
00002130 6/20/1995  1 
00002130 12/19/1996 1 
00020910 3/3/1988   1 
00020910 1/17/1989  1 
00020910 4/5/1989   1 
00020910 7/11/1990  1 
00020910 10/11/1995 1 
;


data want(drop = c d rc);
   
   if _N_ = 1 then do;
      dcl hash h(dataset : 'have2(rename = (cusip = c date = d)', ordered : 'Y');
      h.definekey('c', 'd');
	  h.definedone();
	  dcl hiter i('h');
   end;

   set have1;
   c = '        ';
   d = .;

   if h.check(key : cusip, key : date) = 0 then do;
      pd = date;
	  h.remove(key : cusip, key : date);
   end;

   else do;
      h.add(key : cusip, key : date, data : cusip, data : date);

      if i.setcur(key : cusip, key : date) = 0 then
         if i.prev() = 0 & cusip = c then do;
            pd = d;
			rc = i.last();
			rc = i.next();
            h.remove(key : cusip, key : pd);
      end;

      h.remove(key : cusip, key : date);
   end;

   event = pd > .;

   format pd mmddyy10.;

run;

 

Result:

 

CUSIP     DATE        pd          event 
00002130  08/17/1990  .           0 
00002130  08/20/1990  .           0 
00002130  08/21/1990  .           0 
00002130  08/22/1990  .           0 
00002130  08/23/1990  08/23/1990  1 
00002130  08/24/1990  .           0 
00002130  08/27/1990  .           0 
00020910  03/01/1988  .           0 
00020910  03/02/1988  .           0 
00020910  03/05/1988  03/03/1988  1 
00020910  03/06/1988  .           0 
00020910  04/03/1989  01/17/1989  1 
00020910  04/04/1989  .           0 
00020910  04/05/1989  04/05/1989  1 
00020910  04/06/1989  .           0 

 

KS99
Obsidian | Level 7

Dear Peter, 

 

Wow! Thank you! It works wonderfully. 

By the way, Peter, I found my own fault in explaining my desired results. 

00020910  04/03/1989  01/17/1989  1 

 The results your codes produced have such a row. But I want the stock date and event date to be no more than, say, 5 days. What should I add to your codes? 

 

Many many thanks in advance!! 

 

Sincerely, 

KS Choi -, 

 

PeterClemmensen
Tourmaline | Level 20

A very small correction:

 

data want(drop = c d rc);
   
   if _N_ = 1 then do;
      dcl hash h(dataset : 'have2(rename = (cusip = c date = d)', ordered : 'Y');
      h.definekey('c', 'd');
	  h.definedone();
	  dcl hiter i('h');
   end;

   set have1;
   c = '        ';
   d = .;

   if h.check(key : cusip, key : date) = 0 then do;
      pd = date;
	  h.remove(key : cusip, key : date);
   end;

   else do;
      h.add(key : cusip, key : date, data : cusip, data : date);

      if i.setcur(key : cusip, key : date) = 0 then
         if i.prev() = 0 & cusip = c & date - d <= 5 then do;
            pd = d;
			rc = i.last();
			rc = i.next();
            h.remove(key : cusip, key : pd);
      end;

      h.remove(key : cusip, key : date);
   end;

   event = pd > .;

   format pd mmddyy10.;

run;

 

Result:

 

CUSIP     DATE        pd          event 
00002130  08/17/1990  .           0 
00002130  08/20/1990  .           0 
00002130  08/21/1990  .           0 
00002130  08/22/1990  .           0 
00002130  08/23/1990  08/23/1990  1 
00002130  08/24/1990  .           0 
00002130  08/27/1990  .           0 
00020910  03/01/1988  .           0 
00020910  03/02/1988  .           0 
00020910  03/05/1988  03/03/1988  1 
00020910  03/06/1988  .           0 
00020910  04/03/1989  .           0 
00020910  04/04/1989  .           0 
00020910  04/05/1989  04/05/1989  1 
00020910  04/06/1989  .           0 

 

 

KS99
Obsidian | Level 7

Oh, no worries, Peter. 

I figured it out. 

DATE - pd > 5 then delete or something something, 

 

Thank you! 

 

PeterClemmensen
Tourmaline | Level 20

Just posted a corrected version 🙂

KS99
Obsidian | Level 7

Hi, Peter, 

 

Thank you so much for your revised codes. 

It works perfectly not only for this case, but also for my subsequent creations. 

 

But soon again, I bumped into another problem. 

(I posted it on Community board.) I'd appreciate your help greatly. 

 

Sincerely, 

KS -, 

 

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 510 views
  • 1 like
  • 2 in conversation