Hi all,

I have 2 datasets:

1. The first contains approximately 25 million observations of 5 variables: MutualfundID, year, month, assetID and holdings. For each month of each year, it shows the funds' holdings in multiple assets. So it contains several observations for a specific mutual fund in a specific month, because the funds have positions in several assets.

2. The second contains 250,000 observations of 4 variables: MutualfundID, year, month and one other variable, netfundflows. So it contains only 1 observation for a specific fund in a specific month, because it shows net flows.

What I want to accomplish is to bring those fund flows into the first dataset as a 6th variable. Because each fund has multiple assets each month, the net fund flows will be repeated several times in that month, since they relate to the fund itself and not to the asset.

I tried:

data merged;
merge dataset1 dataset2;
by MutualfundID, year, month;
run;

But this doesn't give me the right results. How should I approach this?
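For reference, here is a minimal sketch of one common way to do this kind of many-to-one match-merge in a DATA step. It assumes that both datasets can be sorted by the three key variables and that dataset2 really has at most one row per fund and month, as described. The flag name in_holdings is hypothetical and only used for illustration. Note that a BY statement lists its variables separated by spaces; commas are not valid there.

```sas
/* MERGE with a BY statement expects both inputs sorted by the BY variables. */
proc sort data=dataset1;
    by MutualfundID year month;
run;

proc sort data=dataset2;
    by MutualfundID year month;
run;

data merged;
    /* in_holdings (hypothetical name) is 1 when the row comes from dataset1 */
    merge dataset1 (in=in_holdings) dataset2;
    by MutualfundID year month;   /* space-separated, no commas */
    /* Keep only fund-months that exist in the holdings data.
       netfundflows is carried across every asset row of the same fund-month
       and is missing where dataset2 has no matching row. */
    if in_holdings;
run;
```

If sorting 25 million rows is too costly, a left join in PROC SQL on the same three keys would likely give the same result without an explicit sort step.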