BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
KS99
Obsidian | Level 7

Hi, all, 

Greetings, and nice to meet you again. 

 

I am trying to merge two datasets. One is a very large mega data of US daily stock returns. The other is a much smaller data containing only important event days for firms. 

What I am testing is how a firm's single-day events impact her post-event stock returns on a daily basis. So, by this merging, I am attaching a dummy variable indicating event days to the mega data of daily stock returns. 

 

I have two following datasets: 

(1) US daily stock returns

CUSIPDATE
000021308/17/1990
000021308/20/1990
000021308/21/1990
000021308/22/1990
000021308/23/1990
000021308/24/1990
000021308/27/1990
….........
000209103/1/1988
000209103/2/1988
000209103/5/1988
000209103/6/1988
….........
000209104/3/1989
000209104/4/1989
000209104/5/1989
000209104/6/1989
….........

 

(2) important event days

CUSIPdateEvent
000021308/23/19901
000021304/17/19921
000021307/18/19921
000021306/22/19941
000021309/21/19941
000021306/20/19951
0000213012/19/19961
000209103/3/19881
000209101/17/19891
000209104/5/19891
000209107/11/19901
0002091010/11/19951
….........

 

(3) And, what I want to obtain is as follows: 

CUSIPDATEEvent
000021308/17/1990 
000021308/20/1990 
000021308/21/1990 
000021308/22/1990 
000021308/23/19901
000021308/24/1990 
000021308/27/1990 
….........
000209103/1/1988 
000209103/2/1988 
000209103/5/19881
000209103/6/1988 
….........
000209104/3/1989 
000209104/4/1989 
000209104/5/19891
000209104/6/1989 
….........

 

From the two base datasets (1) and (2), most event days correspond to daily stock returns. 

However, there are cases in which, like Cusip 00020910's event on 3/3/1988, the event day does not have its corresponding day of stock returns. 

In this case, I want to attach the dummy's value = 1 to the very next observation from the daily stock returns, that is, 3/5/1988. Once again, since I am testing the post-event impact on stock returns, the event dummy should be attached to the NEXT closest observation (i.e., stock returns). 

 

I looked up SAS community posts to figure out how to do this. The best I could find was to use proc sql, by applying On a.cusip=b.cusip where b.date <= a.date group by a.cusip
having abs(a.date-b.date)=min(abs(a.date-b.date)) order by a.cusip, b.date; 

However, it didn't work. Please help! 

 

Sincerely, 

KS -, 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

A very small correction:

 

data want(drop = c d rc);
   
   if _N_ = 1 then do;
      dcl hash h(dataset : 'have2(rename = (cusip = c date = d)', ordered : 'Y');
      h.definekey('c', 'd');
	  h.definedone();
	  dcl hiter i('h');
   end;

   set have1;
   c = '        ';
   d = .;

   if h.check(key : cusip, key : date) = 0 then do;
      pd = date;
	  h.remove(key : cusip, key : date);
   end;

   else do;
      h.add(key : cusip, key : date, data : cusip, data : date);

      if i.setcur(key : cusip, key : date) = 0 then
         if i.prev() = 0 & cusip = c & date - d <= 5 then do;
            pd = d;
			rc = i.last();
			rc = i.next();
            h.remove(key : cusip, key : pd);
      end;

      h.remove(key : cusip, key : date);
   end;

   event = pd > .;

   format pd mmddyy10.;

run;

 

Result:

 

CUSIP     DATE        pd          event 
00002130  08/17/1990  .           0 
00002130  08/20/1990  .           0 
00002130  08/21/1990  .           0 
00002130  08/22/1990  .           0 
00002130  08/23/1990  08/23/1990  1 
00002130  08/24/1990  .           0 
00002130  08/27/1990  .           0 
00020910  03/01/1988  .           0 
00020910  03/02/1988  .           0 
00020910  03/05/1988  03/03/1988  1 
00020910  03/06/1988  .           0 
00020910  04/03/1989  .           0 
00020910  04/04/1989  .           0 
00020910  04/05/1989  04/05/1989  1 
00020910  04/06/1989  .           0 

 

 

View solution in original post

8 REPLIES 8
KS99
Obsidian | Level 7

It is daily stock returns of NYSE and other markets over 1986 to 2020. 

Has more than 40,000,000 obs. So, I divided them into four subsets to do the task in SAS. 

 

Thank you! 

PeterClemmensen
Tourmaline | Level 20

Do not divide your data into smaller portions.

 

Try something like this. Should scale well. 

 

Feel free to ask  🙂

 

data have1;
input CUSIP $ DATE : mmddyy10.;
format DATE mmddyy10.;
datalines;
00002130 8/17/1990 
00002130 8/20/1990 
00002130 8/21/1990 
00002130 8/22/1990 
00002130 8/23/1990 
00002130 8/24/1990 
00002130 8/27/1990 
00020910 3/1/1988  
00020910 3/2/1988  
00020910 3/5/1988  
00020910 3/6/1988  
00020910 4/3/1989  
00020910 4/4/1989  
00020910 4/5/1989  
00020910 4/6/1989  
;

data have2;
input CUSIP $ date :mmddyy10. Event;
format DATE mmddyy10.;
datalines;
00002130 8/23/1990  1 
00002130 4/17/1992  1 
00002130 7/18/1992  1 
00002130 6/22/1994  1 
00002130 9/21/1994  1 
00002130 6/20/1995  1 
00002130 12/19/1996 1 
00020910 3/3/1988   1 
00020910 1/17/1989  1 
00020910 4/5/1989   1 
00020910 7/11/1990  1 
00020910 10/11/1995 1 
;


data want(drop = c d rc);
   
   if _N_ = 1 then do;
      dcl hash h(dataset : 'have2(rename = (cusip = c date = d)', ordered : 'Y');
      h.definekey('c', 'd');
	  h.definedone();
	  dcl hiter i('h');
   end;

   set have1;
   c = '        ';
   d = .;

   if h.check(key : cusip, key : date) = 0 then do;
      pd = date;
	  h.remove(key : cusip, key : date);
   end;

   else do;
      h.add(key : cusip, key : date, data : cusip, data : date);

      if i.setcur(key : cusip, key : date) = 0 then
         if i.prev() = 0 & cusip = c then do;
            pd = d;
			rc = i.last();
			rc = i.next();
            h.remove(key : cusip, key : pd);
      end;

      h.remove(key : cusip, key : date);
   end;

   event = pd > .;

   format pd mmddyy10.;

run;

 

Result:

 

CUSIP     DATE        pd          event 
00002130  08/17/1990  .           0 
00002130  08/20/1990  .           0 
00002130  08/21/1990  .           0 
00002130  08/22/1990  .           0 
00002130  08/23/1990  08/23/1990  1 
00002130  08/24/1990  .           0 
00002130  08/27/1990  .           0 
00020910  03/01/1988  .           0 
00020910  03/02/1988  .           0 
00020910  03/05/1988  03/03/1988  1 
00020910  03/06/1988  .           0 
00020910  04/03/1989  01/17/1989  1 
00020910  04/04/1989  .           0 
00020910  04/05/1989  04/05/1989  1 
00020910  04/06/1989  .           0 

 

KS99
Obsidian | Level 7

Dear Peter, 

 

Wow! Thank you! It works wonderfully. 

By the way, Peter, I found my own fault in explaining my desired results. 

00020910  04/03/1989  01/17/1989  1 

 The results your codes produced have such a row. But I want the stock date and event date to be no more than, say, 5 days. What should I add to your codes? 

 

Many many thanks in advance!! 

 

Sincerely, 

KS Choi -, 

 

PeterClemmensen
Tourmaline | Level 20

A very small correction:

 

data want(drop = c d rc);
   
   if _N_ = 1 then do;
      dcl hash h(dataset : 'have2(rename = (cusip = c date = d)', ordered : 'Y');
      h.definekey('c', 'd');
	  h.definedone();
	  dcl hiter i('h');
   end;

   set have1;
   c = '        ';
   d = .;

   if h.check(key : cusip, key : date) = 0 then do;
      pd = date;
	  h.remove(key : cusip, key : date);
   end;

   else do;
      h.add(key : cusip, key : date, data : cusip, data : date);

      if i.setcur(key : cusip, key : date) = 0 then
         if i.prev() = 0 & cusip = c & date - d <= 5 then do;
            pd = d;
			rc = i.last();
			rc = i.next();
            h.remove(key : cusip, key : pd);
      end;

      h.remove(key : cusip, key : date);
   end;

   event = pd > .;

   format pd mmddyy10.;

run;

 

Result:

 

CUSIP     DATE        pd          event 
00002130  08/17/1990  .           0 
00002130  08/20/1990  .           0 
00002130  08/21/1990  .           0 
00002130  08/22/1990  .           0 
00002130  08/23/1990  08/23/1990  1 
00002130  08/24/1990  .           0 
00002130  08/27/1990  .           0 
00020910  03/01/1988  .           0 
00020910  03/02/1988  .           0 
00020910  03/05/1988  03/03/1988  1 
00020910  03/06/1988  .           0 
00020910  04/03/1989  .           0 
00020910  04/04/1989  .           0 
00020910  04/05/1989  04/05/1989  1 
00020910  04/06/1989  .           0 

 

 

KS99
Obsidian | Level 7

Oh, no worries, Peter. 

I figured it out. 

DATE - pd > 5 then delete or something something, 

 

Thank you! 

 

PeterClemmensen
Tourmaline | Level 20

Just posted a corrected version 🙂

KS99
Obsidian | Level 7

Hi, Peter, 

 

Thank you so much for your revised codes. 

It works perfectly not only for this case, but also for my subsequent creations. 

 

But soon again, I bumped into another problem. 

(I posted it on Community board.) I'd appreciate your help greatly. 

 

Sincerely, 

KS -, 

 

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 1151 views
  • 1 like
  • 2 in conversation