Hello,
I want to join this two tables
Table 1
id offer caldt
1 1997-02-01 1997-12-31
1 1999-02-01 1998-12-31
table 2
id mret caldt
1 0.09 1997-12-31
1 0.87 1998-01-31
...........
1 0.89 1998-12-31
1 0.34 1999-01-31
result table
id mret caldt offer
1 0.09 1997-12-31 1997-02-01
1 0.87 1998-01-31 1997-02-01
...........
1 0.89 1998-12-31 1999-02-01
1 0.34 1999-01-31 1999-02-01
So it's not always the last of the month. In this case, our lookup table best covers all days.
data table1;
input rank_no crsp_fundno mgr_dt :yymmdd10.;
format mgr_dt yymmddd10.;
cards;
1 53 1991-01-01
1 53 1999-01-01
2 105 1985-01-01
2 105 1985-06-01
;
run;
data table2;
input rank_no crsp_fundno caldt :yymmdd10. mret;
format caldt yymmddd10.;
cards;
1 53 1990-11-30 1.0459
1 53 1990-12-31 1.0250
1 53 1991-01-31 1.0435
1 53 1991-02-28 1.0499
1 53 1991-03-28 1.0078
1 53 1999-01-29 0.9810
1 53 1999-02-26 1.0002
1 53 1999-03-31 1.0238
2 105 1984-12-31 1.0238
2 105 1985-01-31 1.0238
2 105 1985-02-28 1.0238
2 105 1985-03-29 1.0238
2 105 1985-04-30 1.0238
2 105 1985-05-31 1.0238
2 105 1985-06-28 1.0238
2 105 1985-07-31 0.9990
;
run;
%let maxdate=%sysfunc(date());
data t1_int (keep=rank_no crsp_fundno oldmgrdt caldt rename=(oldmgrdt=mgr_dt));
set table1 (keep=rank_no crsp_fundno mgr_dt);
by rank_no crsp_fundno;
format oldmgrdt caldt yymmddd10.;
oldmgrdt = intnx('month',lag(mgr_dt),0,'end');
if not first.crsp_fundno
then do;
do caldt = oldmgrdt to mgr_dt - 1;
output;
end;
end;
if last.crsp_fundno
then do;
oldmgrdt = mgr_dt;
do caldt = mgr_dt to &maxdate;
output;
end;
end;
run;
data table3;
merge
t1_int (in=a)
table2 (in=b)
;
by rank_no crsp_fundno caldt;
if b;
run;
proc print data=table3 noobs;
run;
And note, once again, how to provide test data so that your possible helpers do not have to waste time typing those steps themselves. Its not rocket science and just a matter of basic forum courtesy.
This is the result:
crsp_ rank_no fundno caldt mret mgr_dt 1 53 1990-11-30 1.0459 . 1 53 1990-12-31 1.0250 . 1 53 1991-01-31 1.0435 1991-01-31 1 53 1991-02-28 1.0499 1991-01-31 1 53 1991-03-28 1.0078 1991-01-31 1 53 1999-01-29 0.9810 1999-01-01 1 53 1999-02-26 1.0002 1999-01-01 1 53 1999-03-31 1.0238 1999-01-01 2 105 1984-12-31 1.0238 . 2 105 1985-01-31 1.0238 1985-01-31 2 105 1985-02-28 1.0238 1985-01-31 2 105 1985-03-29 1.0238 1985-01-31 2 105 1985-04-30 1.0238 1985-01-31 2 105 1985-05-31 1.0238 1985-01-31 2 105 1985-06-28 1.0238 1985-06-01 2 105 1985-07-31 0.9990 1985-06-01
1) Are those apparent data variables SAS date values or character?
2) Is the implied rule that the table 2 caldt fall between the table 1 offer and caldt within the ID?
2a) if between the dates what happens when there are two or more caldt in table 2 within the range of offer - caldt in table 1? It looks like each one is in the result but want to clarify that is the case.
1)the date here are sas variables YYMMDDN8.
2) in reality you can ignore the caldt in table 1. I want that in the joint table the offer begin when it equal the caldt and continue so until a new value of offer begin
@sasphd wrote:
1)the date here are sas variables YYMMDDN8.
2) in reality you can ignore the caldt in table 1. I want that in the joint table the offer begin when it equal the caldt and continue so until a new value of offer begin
Then your desired result does not match your criteria, as the second offer date is larger than all caldt's in table 2.
On the assumption that your example table 2 is simply not complete, and that caldt in table 2 always falls on the last day of a month, I did this:
data t1;
input id offer :yymmdd10. caldt :yymmdd10.;
format offer caldt yymmddd10.;
cards;
1 1997-02-01 1997-12-31
1 1999-02-01 1998-12-31
;
run;
data t2;
input id mret caldt :yymmdd10.;
format caldt yymmddd10.;
cards;
1 0.09 1997-12-31
1 0.87 1998-01-31
1 0.89 1998-12-31
1 0.34 1999-01-31
1 0.35 1999-02-28
1 0.40 1999-03-31
;
run;
%let maxdate=%sysfunc(date());
data t1_int (keep=id oldoffer caldt rename=(oldoffer=offer));
set t1 (keep=id offer);
by id;
format oldoffer caldt yymmddd10.;
oldoffer = lag(offer);
if not first.id
then do;
caldt = intnx('month',oldoffer,0,'end');
do until (caldt > offer);
output;
caldt = intnx('month',caldt,1,'end');
end;
end;
if last.id
then do;
caldt = intnx('month',offer,0,'end');
oldoffer = offer;
do until (caldt > &maxdate);
output;
caldt = intnx('month',caldt,1,'end');
end;
end;
run;
data want;
merge
t1_int (in=a)
t2 (in=b)
;
by id caldt;
if b;
run;
Basically, I create a lookup table for all possible end-of-month dates from the offers, and merged that with table 2.
If caldt could fall on other days than end-of-month, one could use
do caldt = oldoffer to offer - 1;
and create a lookup table for all dates, but that would of course be quite large.
If the IDs are differents (You use in the example only ID 1) use a PROC SQL, but watch out with what's your primary table, because the observations of table 2 with an ID that don't appear in table 1 will be ignored.
proc sql;
create table table_3 as
select t1.*,
t2.mret
from table_1 t1
left join table_2 t2 on (t1.id = t2.id)
quit;
your solution give me this resut (real data). however in table 3 the caldt is repeted as number of MGR_DT.
How can I eliminate this!!!!!!!!!!!!!!!!!
this is the program and the data tables
proc sql;
create table table_3 as
select t2.*,
t1.mret, t1.caldt
from table2 t1
left join table1 t2 on (t1.crsp_fundno = t2.crsp_fundno)
order by t1.crsp_fundno, t1.Caldt;
quit;
your solution give me this resut (real data). however in table 3 the caldt is repeted as number of MGR_DT. I want distinct caldt for each id
How can I eliminate this!!!!!!!!!!!!!!!!!
this is the program and the data tables
proc sql;
create table table_3 as
select t2.*,
t1.mret, t1.caldt
from table2 t1
left join table1 t2 on (t1.crsp_fundno = t2.crsp_fundno)
order by t1.crsp_fundno, t1.Caldt;
quit;
Hello!
When you use the Left Join, you will use all observations of the declared table 1 (t1) and merge with the observations that match with key (in your case crsp_fundno) of table 2 (t2), and ignore the rest.
How I say in my first comment, it's better to implement this method when you have an unique key (and ID or something), but you hasn't it, because crsp_fundno repeat itself a great amount of times.
The thinks what your codes do is the next:
Your t1 is table2, where do you take only caldt and mret. Then, as table2 is the t1, the program put all the results of this table.
Meanwhile, table1 is t2, so, it will put the result that coincide with the key of t1 (crsp_fundno), and, if t1 have less results (as it is), it will begin to repeat this results, as you see in the graphic bellow.
The crsp_fundno 53 has a lot of results in table2 (t1), and only two in table1 (t2), one with the mgr_dt equal to 01/01/1999 and the other with 01/01/1991. Because of that, the program begin two repeat this result until it fulfill with the t1 results amount. And that is because you have this problem.
Now, if you use this software in other way, with table1 as t1, and table2 as t2. It will take only two results of t2 and ignore the rest (that is a lot). There is a grapich of more joins that sql do:
And with your question. You want differents caidt for each crsp_fundno, but you have a lot of equals crsp_fundno. You want to eliminate duplicate or something?
Sorry for the long text, but I want to explain the code that you do, and why you have that result. But really I don't understand what you want with the crsp_fundno.
thanks a lot
this is the table 3 that I want.
I want that MGR_DT will be repeat for each crsp_fundno for the all period and it began when year and month of MGR_DT equal the caldt.
Many users here don't want to download Excel files because of virus potential, others have such things blocked by security software. Also if you give us Excel we have to create a SAS data set and due to the non-existent constraints on Excel data cells the result we end up with may not have variables of the same type (numeric or character) and even values.
Post example data in the form of data step code. You can find instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... to create datastep code from a SAS data set. Post the results as text in the {i} code boxes or attach as TXT files.
Please post example data in a data step. I'll never touch Excel files from the internet.
able 1
Rank_NO crsp_fundno MGR_DT
1 53 1991-01-01
1 53 1999-01-01
2 105 1985-01-01
2 105 1985-06-01
table 2
Rank_NO | crsp_fundno | caldt | mret |
1 | 53 | 1990-11-30 | 1,0459 |
1 | 53 | 1990-12-31 | 1,0250 |
1 | 53 | 1991-01-31 | 1,0435 |
1 | 53 | 1991-02-28 | 1,0499 |
1 | 53 | 1991-03-28 | 1,0078 |
1 | 53 | 1999-01-29 | 0,9810 |
1 | 53 | 1999-02-26 | 1,0002 |
1 | 53 | 1999-03-31 | 1,0238 |
2 | 105 | 1984-12-31 | 1,0238 |
2 | 105 | 1985-01-31 | 1,0238 |
2 | 105 | 1985-02-28 | 1,0238 |
2 | 105 | 1985-03-29 | 1,0238 |
2 | 105 | 1985-04-30 | 1,0238 |
2 | 105 | 1985-05-31 | 1,0238 |
2 | 105 | 1985-06-28 | 1,0238 |
2 | 105 | 1985-07-31 | 0,9990 |
I want table 3
Rank_NO | crsp_fundno | caldt | mret | MGR_DT |
1 | 53 | 1990-11-30 | 1,0459 | |
1 | 53 | 1990-12-31 | 1,0250 | |
1 | 53 | 1991-01-31 | 1,0435 | 1991-01-01 |
1 | 53 | 1991-02-28 | 1,0499 | 1991-01-01 |
1 | 53 | 1991-03-28 | 1,0078 | 1991-01-01 |
1 | 53 | 1999-01-29 | 0,9810 | 1999-01-01 |
1 | 53 | 1999-02-26 | 1,0002 | 1999-01-01 |
1 | 53 | 1999-03-31 | 1,0238 | 1999-01-01 |
2 | 105 | 1984-12-31 | 1,0238 | |
2 | 105 | 1985-01-31 | 1,0238 | 1985-01-01 |
2 | 105 | 1985-02-28 | 1,0238 | 1985-01-01 |
2 | 105 | 1985-03-29 | 1,0238 | 1985-01-01 |
2 | 105 | 1985-04-30 | 1,0238 | 1985-01-01 |
2 | 105 | 1985-05-31 | 1,0238 | 1985-01-01 |
2 | 105 | 1985-06-28 | 1,0238 | 1985-06-01 |
2 | 105 | 1985-07-31 | 0,999 | 1985-06-01 |
So it's not always the last of the month. In this case, our lookup table best covers all days.
data table1;
input rank_no crsp_fundno mgr_dt :yymmdd10.;
format mgr_dt yymmddd10.;
cards;
1 53 1991-01-01
1 53 1999-01-01
2 105 1985-01-01
2 105 1985-06-01
;
run;
data table2;
input rank_no crsp_fundno caldt :yymmdd10. mret;
format caldt yymmddd10.;
cards;
1 53 1990-11-30 1.0459
1 53 1990-12-31 1.0250
1 53 1991-01-31 1.0435
1 53 1991-02-28 1.0499
1 53 1991-03-28 1.0078
1 53 1999-01-29 0.9810
1 53 1999-02-26 1.0002
1 53 1999-03-31 1.0238
2 105 1984-12-31 1.0238
2 105 1985-01-31 1.0238
2 105 1985-02-28 1.0238
2 105 1985-03-29 1.0238
2 105 1985-04-30 1.0238
2 105 1985-05-31 1.0238
2 105 1985-06-28 1.0238
2 105 1985-07-31 0.9990
;
run;
%let maxdate=%sysfunc(date());
data t1_int (keep=rank_no crsp_fundno oldmgrdt caldt rename=(oldmgrdt=mgr_dt));
set table1 (keep=rank_no crsp_fundno mgr_dt);
by rank_no crsp_fundno;
format oldmgrdt caldt yymmddd10.;
oldmgrdt = intnx('month',lag(mgr_dt),0,'end');
if not first.crsp_fundno
then do;
do caldt = oldmgrdt to mgr_dt - 1;
output;
end;
end;
if last.crsp_fundno
then do;
oldmgrdt = mgr_dt;
do caldt = mgr_dt to &maxdate;
output;
end;
end;
run;
data table3;
merge
t1_int (in=a)
table2 (in=b)
;
by rank_no crsp_fundno caldt;
if b;
run;
proc print data=table3 noobs;
run;
And note, once again, how to provide test data so that your possible helpers do not have to waste time typing those steps themselves. Its not rocket science and just a matter of basic forum courtesy.
This is the result:
crsp_ rank_no fundno caldt mret mgr_dt 1 53 1990-11-30 1.0459 . 1 53 1990-12-31 1.0250 . 1 53 1991-01-31 1.0435 1991-01-31 1 53 1991-02-28 1.0499 1991-01-31 1 53 1991-03-28 1.0078 1991-01-31 1 53 1999-01-29 0.9810 1999-01-01 1 53 1999-02-26 1.0002 1999-01-01 1 53 1999-03-31 1.0238 1999-01-01 2 105 1984-12-31 1.0238 . 2 105 1985-01-31 1.0238 1985-01-31 2 105 1985-02-28 1.0238 1985-01-31 2 105 1985-03-29 1.0238 1985-01-31 2 105 1985-04-30 1.0238 1985-01-31 2 105 1985-05-31 1.0238 1985-01-31 2 105 1985-06-28 1.0238 1985-06-01 2 105 1985-07-31 0.9990 1985-06-01
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.