Help using Base SAS procedures

join two table

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 118
Accepted Solution

join two table

[ Edited ]

Hello, 

I want to join this two tables

Table 1 

id  offer            caldt
1 1997-02-01 1997-12-31
1 1999-02-01 1998-12-31

 

table 2 

id  mret            caldt

1   0.09          1997-12-31

 1   0.87         1998-01-31

...........

1    0.89           1998-12-31

 1     0.34              1999-01-31

 

result table 

 

id  mret            caldt              offer

1   0.09          1997-12-31       1997-02-01

 1   0.87         1998-01-31       1997-02-01

...........

1    0.89           1998-12-31        1999-02-01

1     0.34              1999-01-31     1999-02-01


Accepted Solutions
Solution
‎01-12-2017 09:59 AM
Super User
Posts: 7,757

Re: join two table

So it's not always the last of the month. In this case, our lookup table best covers all days.

data table1;
input rank_no crsp_fundno mgr_dt :yymmdd10.;
format mgr_dt yymmddd10.;
cards;
1 53 1991-01-01
1 53 1999-01-01
2 105 1985-01-01
2 105 1985-06-01
;
run;

data table2;
input rank_no crsp_fundno caldt :yymmdd10. mret;
format caldt yymmddd10.;
cards;
1 53 1990-11-30 1.0459
1 53 1990-12-31 1.0250
1 53 1991-01-31 1.0435
1 53 1991-02-28 1.0499
1 53 1991-03-28 1.0078
1 53 1999-01-29 0.9810
1 53 1999-02-26 1.0002
1 53 1999-03-31 1.0238
2 105 1984-12-31 1.0238
2 105 1985-01-31 1.0238
2 105 1985-02-28 1.0238
2 105 1985-03-29 1.0238
2 105 1985-04-30 1.0238
2 105 1985-05-31 1.0238
2 105 1985-06-28 1.0238
2 105 1985-07-31 0.9990
;
run;

%let maxdate=%sysfunc(date());

data t1_int (keep=rank_no crsp_fundno oldmgrdt caldt rename=(oldmgrdt=mgr_dt));
set table1 (keep=rank_no crsp_fundno mgr_dt);
by rank_no crsp_fundno;
format oldmgrdt caldt yymmddd10.;
oldmgrdt = intnx('month',lag(mgr_dt),0,'end');
if not first.crsp_fundno
then do;
  do caldt = oldmgrdt to mgr_dt - 1;
    output;
  end;
end;
if last.crsp_fundno
then do;
  oldmgrdt = mgr_dt;
  do caldt = mgr_dt to &maxdate;
    output;
  end;
end;
run;

data table3;
merge
  t1_int (in=a)
  table2 (in=b)
;
by rank_no crsp_fundno caldt;
if b;
run;

proc print data=table3 noobs;
run;

And note, once again, how to provide test data so that your possible helpers do not have to waste time typing those steps themselves. Its not rocket science and just a matter of basic forum courtesy.

 

This is the result:

            crsp_
rank_no    fundno         caldt     mret         mgr_dt

   1          53     1990-11-30    1.0459             .
   1          53     1990-12-31    1.0250             .
   1          53     1991-01-31    1.0435    1991-01-31
   1          53     1991-02-28    1.0499    1991-01-31
   1          53     1991-03-28    1.0078    1991-01-31
   1          53     1999-01-29    0.9810    1999-01-01
   1          53     1999-02-26    1.0002    1999-01-01
   1          53     1999-03-31    1.0238    1999-01-01
   2         105     1984-12-31    1.0238             .
   2         105     1985-01-31    1.0238    1985-01-31
   2         105     1985-02-28    1.0238    1985-01-31
   2         105     1985-03-29    1.0238    1985-01-31
   2         105     1985-04-30    1.0238    1985-01-31
   2         105     1985-05-31    1.0238    1985-01-31
   2         105     1985-06-28    1.0238    1985-06-01
   2         105     1985-07-31    0.9990    1985-06-01
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers

View solution in original post


All Replies
Super User
Posts: 11,336

Re: join two table

1) Are those apparent data variables SAS date values or character?

2) Is the implied rule that the table 2 caldt fall between the table 1 offer and caldt within the ID?

   2a) if between the dates what happens when there are two or more caldt in table 2 within the range of offer - caldt in table 1? It looks like each one is in the result but want to clarify that is the case.

Frequent Contributor
Posts: 118

Re: join two table

1)the date here are sas variables YYMMDDN8.

 

2) in reality you can ignore the caldt in table 1. I want that in the joint table the offer begin when it equal the caldt and continue so until a new value of offer begin 

 

 

 

Super User
Posts: 7,757

Re: join two table


sasphd wrote:

1)the date here are sas variables YYMMDDN8.

 

2) in reality you can ignore the caldt in table 1. I want that in the joint table the offer begin when it equal the caldt and continue so until a new value of offer begin 

 

 

 


Then your desired result does not match your criteria, as the second offer date is larger than all caldt's in table 2.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Super User
Posts: 7,757

Re: join two table

On the assumption that your example table 2 is simply not complete, and that caldt in table 2 always falls on the last day of a month, I did this:

data t1;
input id offer :yymmdd10. caldt :yymmdd10.;
format offer caldt yymmddd10.;
cards;
1 1997-02-01 1997-12-31
1 1999-02-01 1998-12-31
;
run;

data t2;
input id mret caldt :yymmdd10.;
format caldt yymmddd10.;
cards;
1 0.09 1997-12-31
1 0.87 1998-01-31
1 0.89 1998-12-31
1 0.34 1999-01-31
1 0.35 1999-02-28
1 0.40 1999-03-31
;
run;

%let maxdate=%sysfunc(date());

data t1_int (keep=id oldoffer caldt rename=(oldoffer=offer));
set t1 (keep=id offer);
by id;
format oldoffer caldt yymmddd10.;
oldoffer = lag(offer);
if not first.id
then do;
  caldt = intnx('month',oldoffer,0,'end');
  do until (caldt > offer);
    output;
    caldt = intnx('month',caldt,1,'end');
  end;
end;
if last.id
then do;
  caldt = intnx('month',offer,0,'end');
  oldoffer = offer;
  do until (caldt > &maxdate);
    output;
    caldt = intnx('month',caldt,1,'end');
  end;
end;
run;

data want;
merge
  t1_int (in=a)
  t2 (in=b)
;
by id caldt;
if b;
run;

Basically, I create a lookup table for all possible end-of-month dates from the offers, and merged that with table 2.

If caldt could fall on other days than end-of-month, one could use

 do caldt = oldoffer to offer - 1;

and create a lookup table for all dates, but that would of course be quite large.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Occasional Contributor
Posts: 10

Re: join two table

[ Edited ]

If the IDs are differents (You use in the example only ID 1) use a PROC SQL, but watch out with what's your primary table, because the observations of table 2 with an ID that don't appear in table 1 will be ignored.

 

proc sql;
   create table table_3 as
   select t1.*,
          t2.mret
   from table_1 t1
   left join table_2 t2 on (t1.id = t2.id)
quit;

 

Frequent Contributor
Posts: 118

Re: join two table

Posted in reply to Feragon42

your solution give me this resut (real data). however in table 3 the caldt is repeted as number of MGR_DT. 

How can I eliminate this!!!!!!!!!!!!!!!!!

this is the program and the data tables

proc sql;
create table table_3 as
select t2.*,
t1.mret, t1.caldt
from table2 t1
left join table1 t2 on (t1.crsp_fundno = t2.crsp_fundno)
order by t1.crsp_fundno, t1.Caldt;
quit;

Frequent Contributor
Posts: 118

Re: join two table

[ Edited ]
Posted in reply to Feragon42

your solution give me this resut (real data). however in table 3 the caldt is repeted as number of MGR_DT. I want distinct caldt for each id

How can I eliminate this!!!!!!!!!!!!!!!!!

this is the program and the data tables

proc sql;
create table table_3 as
select t2.*,
t1.mret, t1.caldt
from table2 t1
left join table1 t2 on (t1.crsp_fundno = t2.crsp_fundno)
order by t1.crsp_fundno, t1.Caldt;
quit;

Occasional Contributor
Posts: 10

Re: join two table

Hello!

When you use the Left Join, you will use all observations of the declared table 1 (t1) and merge with the observations that match with key (in your case crsp_fundno) of table 2 (t2), and ignore the rest.

 img_leftjoin.gif

 

How I say in my first comment, it's better to implement this method when you have an unique key (and ID or something), but you hasn't it, because crsp_fundno repeat itself a great amount of times. 

The thinks what your codes do is the next:

Your t1 is table2, where do you take only caldt and mret. Then, as table2 is the t1, the program put all the results of this table.

Meanwhile, table1 is t2, so, it will put the result that coincide with the key of t1 (crsp_fundno), and, if t1 have less results (as it is), it will begin to repeat this results, as you see in the graphic bellow.

The crsp_fundno 53 has a lot of results in table2 (t1), and only two in table1 (t2), one with the mgr_dt equal to 01/01/1999 and the other with 01/01/1991. Because of that, the program begin two repeat this result until it fulfill with the t1 results amount. And that is because you have this problem.

Captura.PNG

Now, if you use this software in other way, with table1 as t1, and table2 as t2. It will take only two results of t2 and ignore the rest (that is a lot). There is a grapich of more joins that sql do:

SQL Joins.jpg

 

And with your question. You want differents caidt for each crsp_fundno, but you have a lot of equals crsp_fundno. You want to eliminate duplicate or something?

Sorry for the long text, but I want to explain the code that you do, and why you have that result. But really I don't understand what you want with the crsp_fundno. 

Frequent Contributor
Posts: 118

Re: join two table

Posted in reply to Feragon42

thanks a lot 

this is the table 3 that I want. 

I want that MGR_DT will be repeat for each crsp_fundno for the all period and it began when year and month of MGR_DT equal the caldt. 

 

Super User
Posts: 11,336

Re: join two table

Many users here don't want to download Excel files because of virus potential, others have such things blocked by security software. Also if you give us Excel we have to create a SAS data set and due to the non-existent constraints on Excel data cells the result we end up with may not have variables of the same type (numeric or character) and even values.

 

Post example data in the form of data step code. You can find instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... to create datastep code from a SAS data set. Post the results as text in the {i} code boxes or attach as TXT files.

Super User
Posts: 7,757

Re: join two table

Please post example data in a data step. I'll never touch Excel files from the internet.

 

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Frequent Contributor
Posts: 118

Re: join two table

Posted in reply to KurtBremser

able 1

 

Rank_NO crsp_fundno MGR_DT
1                      53           1991-01-01
1                      53            1999-01-01
2                    105             1985-01-01
2                    105             1985-06-01

 

 

table 2

Rank_NO crsp_fundno caldt mret
1 53 1990-11-30 1,0459
1 53 1990-12-31 1,0250
1 53 1991-01-31 1,0435
1 53 1991-02-28 1,0499
1 53 1991-03-28 1,0078
1 53 1999-01-29 0,9810
1 53 1999-02-26 1,0002
1 53 1999-03-31 1,0238
2 105 1984-12-31 1,0238
2 105 1985-01-31 1,0238
2 105 1985-02-28 1,0238
2 105 1985-03-29 1,0238
2 105 1985-04-30 1,0238
2 105 1985-05-31 1,0238
2 105 1985-06-28 1,0238
2 105 1985-07-31 0,9990

 

 

I want table 3 

Rank_NO crsp_fundno caldt mret MGR_DT
1 53 1990-11-30 1,0459  
1 53 1990-12-31 1,0250  
1 53 1991-01-31 1,0435 1991-01-01
1 53 1991-02-28 1,0499 1991-01-01
1 53 1991-03-28 1,0078 1991-01-01
1 53 1999-01-29 0,9810 1999-01-01
1 53 1999-02-26 1,0002 1999-01-01
1 53 1999-03-31 1,0238 1999-01-01
2 105 1984-12-31 1,0238  
2 105 1985-01-31 1,0238 1985-01-01
2 105 1985-02-28 1,0238 1985-01-01
2 105 1985-03-29 1,0238 1985-01-01
2 105 1985-04-30 1,0238 1985-01-01
2 105 1985-05-31 1,0238 1985-01-01
2 105 1985-06-28 1,0238 1985-06-01
2 105 1985-07-31 0,999 1985-06-01
Solution
‎01-12-2017 09:59 AM
Super User
Posts: 7,757

Re: join two table

So it's not always the last of the month. In this case, our lookup table best covers all days.

data table1;
input rank_no crsp_fundno mgr_dt :yymmdd10.;
format mgr_dt yymmddd10.;
cards;
1 53 1991-01-01
1 53 1999-01-01
2 105 1985-01-01
2 105 1985-06-01
;
run;

data table2;
input rank_no crsp_fundno caldt :yymmdd10. mret;
format caldt yymmddd10.;
cards;
1 53 1990-11-30 1.0459
1 53 1990-12-31 1.0250
1 53 1991-01-31 1.0435
1 53 1991-02-28 1.0499
1 53 1991-03-28 1.0078
1 53 1999-01-29 0.9810
1 53 1999-02-26 1.0002
1 53 1999-03-31 1.0238
2 105 1984-12-31 1.0238
2 105 1985-01-31 1.0238
2 105 1985-02-28 1.0238
2 105 1985-03-29 1.0238
2 105 1985-04-30 1.0238
2 105 1985-05-31 1.0238
2 105 1985-06-28 1.0238
2 105 1985-07-31 0.9990
;
run;

%let maxdate=%sysfunc(date());

data t1_int (keep=rank_no crsp_fundno oldmgrdt caldt rename=(oldmgrdt=mgr_dt));
set table1 (keep=rank_no crsp_fundno mgr_dt);
by rank_no crsp_fundno;
format oldmgrdt caldt yymmddd10.;
oldmgrdt = intnx('month',lag(mgr_dt),0,'end');
if not first.crsp_fundno
then do;
  do caldt = oldmgrdt to mgr_dt - 1;
    output;
  end;
end;
if last.crsp_fundno
then do;
  oldmgrdt = mgr_dt;
  do caldt = mgr_dt to &maxdate;
    output;
  end;
end;
run;

data table3;
merge
  t1_int (in=a)
  table2 (in=b)
;
by rank_no crsp_fundno caldt;
if b;
run;

proc print data=table3 noobs;
run;

And note, once again, how to provide test data so that your possible helpers do not have to waste time typing those steps themselves. Its not rocket science and just a matter of basic forum courtesy.

 

This is the result:

            crsp_
rank_no    fundno         caldt     mret         mgr_dt

   1          53     1990-11-30    1.0459             .
   1          53     1990-12-31    1.0250             .
   1          53     1991-01-31    1.0435    1991-01-31
   1          53     1991-02-28    1.0499    1991-01-31
   1          53     1991-03-28    1.0078    1991-01-31
   1          53     1999-01-29    0.9810    1999-01-01
   1          53     1999-02-26    1.0002    1999-01-01
   1          53     1999-03-31    1.0238    1999-01-01
   2         105     1984-12-31    1.0238             .
   2         105     1985-01-31    1.0238    1985-01-31
   2         105     1985-02-28    1.0238    1985-01-31
   2         105     1985-03-29    1.0238    1985-01-31
   2         105     1985-04-30    1.0238    1985-01-31
   2         105     1985-05-31    1.0238    1985-01-31
   2         105     1985-06-28    1.0238    1985-06-01
   2         105     1985-07-31    0.9990    1985-06-01
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 13 replies
  • 739 views
  • 2 likes
  • 4 in conversation