BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
abdulla
Pyrite | Level 9
Set ASet ASet A Set BSet B Set CSet CSet CSet C
PermnoDateRtrn Permnodate PermnoDateRtrnevent_date
11061201105310.115225 1090920120701 11061201105310.115225-2
11061201205310.101424 1106120130820 11061201205310.101424-1
11061201305310.114746 1126420130402 11061201305310.1147460
11061201405310.11639 1130820120813 11061201405310.116391
11061201505310.055248 1205420130701 11061201505310.0552482
11061201605310.094931 1214920130809 11061201605310.0949313
12169201112310.029143 1216920130701 12169201112310.029143-2
12169201212310.060995 1235720120201 12169201212310.060995-1
12169201312310.069918 1237020120516 12169201312310.0699180
12169201412310.150442    12169201412310.1504421
12169201512310.183641    12169201512310.1836412
12169201612310.152767    12169201612310.1527673
12370201112310.173508    12370201112310.173508-1
12370201212310.190566    12370201212310.1905660
12370201312310.073573    12370201312310.0735731
12370201412310.085915    12370201412310.0859152
12370201512310.075433    12370201512310.0754333
12370201612310.112746    12370201612310.1127464

 

I need help with the above data. I have two different files, Set A and Set B.  I want to merge these two files as in Set C. 

Set B shows some event dates. In Set C, those years will be zero and after the event year there will be positive serial numbers and before that negative serial numbers.  The files need to be merged based on permno and date. If you need you can only keep year(not month and day) in date variables. 

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

Simple with a SQL join and SAS date interval functions:

 

data a;
input permno dateA :yymmdd8. rtm;
format dateA yymmdd10.;
datalines;
11061	20110531	0.115225
11061	20120531	0.101424
11061	20130531	0.114746
11061	20140531	0.11639	 
11061	20150531	0.055248
11061	20160531	0.094931
12169	20111231	0.029143
12169	20121231	0.060995
12169	20131231	0.069918
12169	20141231	0.150442
12169	20151231	0.183641
12169	20161231	0.152767
12370	20111231	0.173508
12370	20121231	0.190566
12370	20131231	0.073573
12370	20141231	0.085915
12370	20151231	0.075433
12370	20161231	0.112746
;

data b;
input permno dateB :yymmdd8.;
format dateB yymmdd10.;
datalines;
10909	20120701
11061	20130820
11264	20130402
11308	20120813	
12054	20130701
12149	20130809
12169	20130701
12357	20120201
12370	20120516
;

proc sql;
create table c as
select  
    a.*,
    intck("YEAR", b.dateB, a.dateA) as event_date
from a left join b on a.permNo=b.permNo
order by permNo, dateA;
select * from c;
quit;



PG

View solution in original post

13 REPLIES 13
Shmuel
Garnet | Level 18

Assuming both sets are sorted, you need first to merge the data and then compute the event_date:

data set_C;
  merge set_A  (in=ina)
        set_B  (in=inb)
   ;
     by permno date;
          reatin event_date;
          if first.permno then even_date = .;  

          if ina;  /* ignore set_B if no match to set_A */
          if ina and inb then  event_date = 0;
          else event_date + 1;   /* positive sequence */
run;

proc sort data=set_C; by perno descending date; run

data set_C;
  set set_C;
        retain event_dt;
        if event_date ne . then evebt_dt = event_date;
        else do;
            event_dt -1;
            event_date = event_dt;  /* negative sequence */
        end;
run;

proc sort data=set_C; by perno date; run

abdulla
Pyrite | Level 9
I tried the code. It is not what I am looking for. It is not the same as set C
PGStats
Opal | Level 21

Simple with a SQL join and SAS date interval functions:

 

data a;
input permno dateA :yymmdd8. rtm;
format dateA yymmdd10.;
datalines;
11061	20110531	0.115225
11061	20120531	0.101424
11061	20130531	0.114746
11061	20140531	0.11639	 
11061	20150531	0.055248
11061	20160531	0.094931
12169	20111231	0.029143
12169	20121231	0.060995
12169	20131231	0.069918
12169	20141231	0.150442
12169	20151231	0.183641
12169	20161231	0.152767
12370	20111231	0.173508
12370	20121231	0.190566
12370	20131231	0.073573
12370	20141231	0.085915
12370	20151231	0.075433
12370	20161231	0.112746
;

data b;
input permno dateB :yymmdd8.;
format dateB yymmdd10.;
datalines;
10909	20120701
11061	20130820
11264	20130402
11308	20120813	
12054	20130701
12149	20130809
12169	20130701
12357	20120201
12370	20120516
;

proc sql;
create table c as
select  
    a.*,
    intck("YEAR", b.dateB, a.dateA) as event_date
from a left join b on a.permNo=b.permNo
order by permNo, dateA;
select * from c;
quit;



PG
abdulla
Pyrite | Level 9
Really Awesome
abdulla
Pyrite | Level 9

Hi PG,

I need a little more help.
I have a huge set of data for both SET A and B. So, I will not be able to use datalines.
data a;
input permno dateA :yymmdd8. rtm;
format dateA yymmdd10.;
datalines;

Without using this dataline how will I change my date in SET A?

PGStats
Opal | Level 21

Datalines is just for demonstration. With SAS your data can be just about anywhere. Where is your data.

PG
abdulla
Pyrite | Level 9
My data is in the event file.
So I used

Data AAA;
set event;
input permno dateA :yymmdd8. rtm;
format dateA yymmdd10.;
run;

So I am making some silly mistakes here. Could you please tell me what that is? One thing I know is that I should not use input.
PGStats
Opal | Level 21

Assuming your event file is a text file, you should use an INFILE statement to tell SAS where your data is.

PG
abdulla
Pyrite | Level 9
I am sorry that it is not still working. My one is an excel file.
-----
23
ERROR 23-2: Invalid option name DATEA.
PGStats
Opal | Level 21

In that case, use proc import to bring in your data as a SAS dataset. Read the documentation for proc import.

PG
abdulla
Pyrite | Level 9
I have imported the data before. In my event file I have Permno, date, assets, inc. Now I used the code but it is showing error in dateA.

Data AAA;
set event;
infile permno dateA :yymmdd8. assets inc;
format dateA yymmdd10.;
run;
PGStats
Opal | Level 21

Once you have the data in SAS datasets, take the code above (the SQL part) and substitute your dataset names and variable names.

PG
abdulla
Pyrite | Level 9
I will use SQL part only when I create dateA and dateB. My problem is that I can't even create "dateA" and "dateB". So. I have to first format the date in my event file. That's where I am facing problem.

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 13 replies
  • 1534 views
  • 1 like
  • 3 in conversation