DATA Step, Macro, Functions and more

Merging two data sets

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 19
Accepted Solution

Merging two data sets

Hi

I would like to merge two data sets.

The first data set looks like this.

cusip         date_crsp      ret

000361     1990Q1         0.2

000361     1990Q2         0.12

000361     1990Q3         -0.06

000361     1990Q4         -0.02

The second data set looks like this.

cusip         date_crsp      sales

000361     1990Q1         1240

000361     1990Q2         4673

000361     1990Q3         3467

The output should look like this.

cusip         date_crsp     return     sales

000361     1990Q1        0.2          1240 

000361     1990Q2        0.12        4673

000361     1990Q3        -0.06       3467

000361     1990Q4        -0.02           .

But my output looks like this.

cusip         date_crsp     ret     sales

000361     1990Q1        .         1240  

000361     1990Q1        0.2      .

000361     1990Q2        .          4673 

000361     1990Q2        0.12    .

000361     1990Q3        .          3467

000361     1990Q3        -0.06   . 

000361     1990Q4        -0.02   .

The by variables are cusip and date_crsp.

The cusip in the two data sets have the same length, and the date_crsp have the same format.

My code is also not too complicated, so I do not know where the mistake is. I did also the proc sort for the two data sets for cusip and date_crsp.

I used the following code:

data combined;

merge dataset1 dataset2;

by cusip date_crsp;

run;

Thanks for your help!!


Accepted Solutions
Solution
‎05-29-2013 08:41 AM
Super User
Super User
Posts: 6,499

Re: Merging two data sets

The values are probably NOT the same in the two datasets. Looks like you have date variables formatted to display as year and quarter.  So there are about 90 different values the date variables could have to make them look like they have the same value when displayed as quarters.

Try either converting your variables to character strings or convert all values to the first day of the quarter.

View solution in original post


All Replies
Solution
‎05-29-2013 08:41 AM
Super User
Super User
Posts: 6,499

Re: Merging two data sets

The values are probably NOT the same in the two datasets. Looks like you have date variables formatted to display as year and quarter.  So there are about 90 different values the date variables could have to make them look like they have the same value when displayed as quarters.

Try either converting your variables to character strings or convert all values to the first day of the quarter.

Occasional Contributor
Posts: 19

Re: Merging two data sets

Hi Tom

Thank you for the hint. Now the merge command works perfectly.

Super Contributor
Posts: 358

Re: Merging two data sets

Hi:

If you only need the merged data then try:

data combined;

merge dataset1  (in=a)

           dataset2  (in=b)

by cusip date_crsp;

if a and b then output;

               else delete;

run;

I prefer to do:

proc sql;

create table combined as

select distinct a.*,

                       b.sales

from dataset1 as a,

         dataset2 as b

where a.cusip=b.cusip and

          a.date_crsp=b.date_crsp;

quit;

run;

Occasional Contributor
Posts: 12

Re: Merging two data sets

Nice solution OS, I would like to add that Merge is much faster than SQL but even I prefer SQL because its simpler and more dependable but an inner join instead of this join would have been faster however both will give the same answer

Thanks Njoi

Occasional Contributor
Posts: 10

Re: Merging two data sets

proc sort data=first;

by  cusip  date_crsp ;

run;   

proc sort data=second;

by  cusip  date_crsp ;

run;   

data final;

merge first(in=a) second;

by  cusip  date_crsp ;

if a;

run;

it works

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 273 views
  • 7 likes
  • 5 in conversation