Set A | Set A | Set A | Set B | Set B | Set C | Set C | Set C | Set C | ||
Permno | Date | Rtrn | Permno | date | Permno | Date | Rtrn | event_date | ||
11061 | 20110531 | 0.115225 | 10909 | 20120701 | 11061 | 20110531 | 0.115225 | -2 | ||
11061 | 20120531 | 0.101424 | 11061 | 20130820 | 11061 | 20120531 | 0.101424 | -1 | ||
11061 | 20130531 | 0.114746 | 11264 | 20130402 | 11061 | 20130531 | 0.114746 | 0 | ||
11061 | 20140531 | 0.11639 | 11308 | 20120813 | 11061 | 20140531 | 0.11639 | 1 | ||
11061 | 20150531 | 0.055248 | 12054 | 20130701 | 11061 | 20150531 | 0.055248 | 2 | ||
11061 | 20160531 | 0.094931 | 12149 | 20130809 | 11061 | 20160531 | 0.094931 | 3 | ||
12169 | 20111231 | 0.029143 | 12169 | 20130701 | 12169 | 20111231 | 0.029143 | -2 | ||
12169 | 20121231 | 0.060995 | 12357 | 20120201 | 12169 | 20121231 | 0.060995 | -1 | ||
12169 | 20131231 | 0.069918 | 12370 | 20120516 | 12169 | 20131231 | 0.069918 | 0 | ||
12169 | 20141231 | 0.150442 | 12169 | 20141231 | 0.150442 | 1 | ||||
12169 | 20151231 | 0.183641 | 12169 | 20151231 | 0.183641 | 2 | ||||
12169 | 20161231 | 0.152767 | 12169 | 20161231 | 0.152767 | 3 | ||||
12370 | 20111231 | 0.173508 | 12370 | 20111231 | 0.173508 | -1 | ||||
12370 | 20121231 | 0.190566 | 12370 | 20121231 | 0.190566 | 0 | ||||
12370 | 20131231 | 0.073573 | 12370 | 20131231 | 0.073573 | 1 | ||||
12370 | 20141231 | 0.085915 | 12370 | 20141231 | 0.085915 | 2 | ||||
12370 | 20151231 | 0.075433 | 12370 | 20151231 | 0.075433 | 3 | ||||
12370 | 20161231 | 0.112746 | 12370 | 20161231 | 0.112746 | 4 |
I need help with the above data. I have two different files, Set A and Set B. I want to merge these two files as in Set C.
Set B shows some event dates. In Set C, those years will be zero and after the event year there will be positive serial numbers and before that negative serial numbers. The files need to be merged based on permno and date. If you need you can only keep year(not month and day) in date variables.
Simple with a SQL join and SAS date interval functions:
data a;
input permno dateA :yymmdd8. rtm;
format dateA yymmdd10.;
datalines;
11061 20110531 0.115225
11061 20120531 0.101424
11061 20130531 0.114746
11061 20140531 0.11639
11061 20150531 0.055248
11061 20160531 0.094931
12169 20111231 0.029143
12169 20121231 0.060995
12169 20131231 0.069918
12169 20141231 0.150442
12169 20151231 0.183641
12169 20161231 0.152767
12370 20111231 0.173508
12370 20121231 0.190566
12370 20131231 0.073573
12370 20141231 0.085915
12370 20151231 0.075433
12370 20161231 0.112746
;
data b;
input permno dateB :yymmdd8.;
format dateB yymmdd10.;
datalines;
10909 20120701
11061 20130820
11264 20130402
11308 20120813
12054 20130701
12149 20130809
12169 20130701
12357 20120201
12370 20120516
;
proc sql;
create table c as
select
a.*,
intck("YEAR", b.dateB, a.dateA) as event_date
from a left join b on a.permNo=b.permNo
order by permNo, dateA;
select * from c;
quit;
Assuming both sets are sorted, you need first to merge the data and then compute the event_date:
data set_C;
merge set_A (in=ina)
set_B (in=inb)
;
by permno date;
reatin event_date;
if first.permno then even_date = .;
if ina; /* ignore set_B if no match to set_A */
if ina and inb then event_date = 0;
else event_date + 1; /* positive sequence */
run;
proc sort data=set_C; by perno descending date; run
data set_C;
set set_C;
retain event_dt;
if event_date ne . then evebt_dt = event_date;
else do;
event_dt -1;
event_date = event_dt; /* negative sequence */
end;
run;
proc sort data=set_C; by perno date; run
Simple with a SQL join and SAS date interval functions:
data a;
input permno dateA :yymmdd8. rtm;
format dateA yymmdd10.;
datalines;
11061 20110531 0.115225
11061 20120531 0.101424
11061 20130531 0.114746
11061 20140531 0.11639
11061 20150531 0.055248
11061 20160531 0.094931
12169 20111231 0.029143
12169 20121231 0.060995
12169 20131231 0.069918
12169 20141231 0.150442
12169 20151231 0.183641
12169 20161231 0.152767
12370 20111231 0.173508
12370 20121231 0.190566
12370 20131231 0.073573
12370 20141231 0.085915
12370 20151231 0.075433
12370 20161231 0.112746
;
data b;
input permno dateB :yymmdd8.;
format dateB yymmdd10.;
datalines;
10909 20120701
11061 20130820
11264 20130402
11308 20120813
12054 20130701
12149 20130809
12169 20130701
12357 20120201
12370 20120516
;
proc sql;
create table c as
select
a.*,
intck("YEAR", b.dateB, a.dateA) as event_date
from a left join b on a.permNo=b.permNo
order by permNo, dateA;
select * from c;
quit;
Hi PG,
I need a little more help.
I have a huge set of data for both SET A and B. So, I will not be able to use datalines.
data a;
input permno dateA :yymmdd8. rtm;
format dateA yymmdd10.;
datalines;
Without using this dataline how will I change my date in SET A?
Datalines is just for demonstration. With SAS your data can be just about anywhere. Where is your data.
Assuming your event file is a text file, you should use an INFILE statement to tell SAS where your data is.
In that case, use proc import to bring in your data as a SAS dataset. Read the documentation for proc import.
Once you have the data in SAS datasets, take the code above (the SQL part) and substitute your dataset names and variable names.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.