Hello! When you use the Left Join, you will use all observations of the declared table 1 (t1) and merge with the observations that match with key (in your case crsp_fundno) of table 2 (t2), and ignore the rest. How I say in my first comment, it's better to implement this method when you have an unique key (and ID or something), but you hasn't it, because crsp_fundno repeat itself a great amount of times. The thinks what your codes do is the next: Your t1 is table2, where do you take only caldt and mret. Then, as table2 is the t1, the program put all the results of this table. Meanwhile, table1 is t2, so, it will put the result that coincide with the key of t1 (crsp_fundno), and, if t1 have less results (as it is), it will begin to repeat this results, as you see in the graphic bellow. The crsp_fundno 53 has a lot of results in table2 (t1), and only two in table1 (t2), one with the mgr_dt equal to 01/01/1999 and the other with 01/01/1991. Because of that, the program begin two repeat this result until it fulfill with the t1 results amount. And that is because you have this problem. Now, if you use this software in other way, with table1 as t1, and table2 as t2. It will take only two results of t2 and ignore the rest (that is a lot). There is a grapich of more joins that sql do: And with your question. You want differents caidt for each crsp_fundno, but you have a lot of equals crsp_fundno. You want to eliminate duplicate or something? Sorry for the long text, but I want to explain the code that you do, and why you have that result. But really I don't understand what you want with the crsp_fundno.
... View more