Merging 2 Datasets

Reply
New Contributor
Posts: 3

Merging 2 Datasets

I have 2 data sets I want to merge.  One of the datasets has date formatted as 2 columns with year as one column and month as the other column, so it looks like this:

yearmonth
20026
20027
20028
20029

The other dataset has date under just 1 column with the format as yyyymmdd, for example the date for today is written as 20141212.  The dates for this datatset are all dates for the last day of the month, ie. 19881031, 19881130, 19881230. 19890131.  How do I format this second dataset so it is the same as the first dataset and then merge the 2.  I'm thinking there is some way to assign the first 4 digits as the year and the next 2 digits as the month

Thanks!

Super User
Posts: 19,038

Re: Merging 2 Datasets

Is the date variable a SAS date, ie numeric with a format of yymmdd8., or a character variable.

A sample of the second table is helpful as well.

What do you want your output to look like?

New Contributor
Posts: 3

Re: Merging 2 Datasets

Sorry I don't think my explanation is quite clear, I'll try to go more in depth.

So my first dataset looks like this

Permnoyearmonth# of purchasesamount purchased
39220200261135135
39220200271154625
3922020028325625645
3922020029523451345
3922120026623452134
3922120027114324245
39221200284124145254

The second dataset looks like this

permnodatereturn
39220200206300.064382
3922020020731-0.061234
39220200208310.001320
39220200209300.133154
39221200206300.021543
39221200207310.018902
39221200208310.114314

And I want the output to be

permnoyearmonth# of purchasesamount purchasedreturn
392202002611351350.064382
39220200271154625-0.061234
39220200283256256450.001320

and so forth...

The data variable is a SAS date with the format YYMMDDN8.

Respected Advisor
Posts: 4,130

Re: Merging 2 Datasets

data have1;

  infile datalines;

  input permno year month purch_cnt purch_amt;

  datalines;

39220 2002 6 1 135135

39220 2002 7 1 154625

39220 2002 8 3 25625645

39220 2002 9 5 23451345

39221 2002 6 6 23452134

39221 2002 7 1 14324245

39221 2002 8 4 124145254

;

run;

data have2;

  infile datalines;

  input permno date:yymmdd. return;

  format date date9.;

  datalines;

39220 20020630 0.064382

39220 20020731 -0.061234

39220 20020831 0.001320

39220 20020930 0.133154

39221 20020630 0.021543

39221 20020731 0.018902

39221 20020831 0.114314

;

run;

proc sql;

  create table want as

  select h1.permno, h1.year, h1.month, h1.purch_cnt, h1.purch_amt, h2.return

  from have1 h1 left join have2 h2

    on h1.permno=h2.permno and h1.year=year(h2.date) and h1.month=month(h2.date)

  order by h1.permno, h1.year, h1.month

  ;

quit;

New Contributor
Posts: 3

Re: Merging 2 Datasets

Patrick thank you for your help. Unfortunately the datasets I have are thousands of lines, I only included the first couple to show what they looked.  I don't think it is feasible for me to manually input the datasets.  I am in an introductory course to mysql and in a previous assignment, I have used the following code to convert dates in the yyyymmdd format to yyyymm, the solution may be something similar to this:

data msf;

      set lib361.msf;

      where 0 < hexcd < 3;

      yyyymm=year(date)*100+month(date);

proc sort data=msf;

      by yyyymm;

proc means data=msf noprint;

      by yyyymm;

      output out=d_mkt_return

            mean(ret)=market_ret;

proc sql;

create table d as

select

      a.permno,

      a.ret,

      a.yyyymm

from

    msf as a

where

      196404<=a.yyyymm<=196506;

run;

Respected Advisor
Posts: 4,130

Re: Merging 2 Datasets

The 2 data sets Have1 & Have2 create the sample source data based on what you've posted. You say that you're already having 2 data sets so you would only need the SQL statement creating the want data set.

Not sure what this has to do with mysql. Nowhere in your code samples (using SAS tables in WORK) nor descriptions is anything said about having the data in a database.

In general: In order to join the 2 source tables you either need to derive the Year and Month from the Date variable (that's what I've done) or you need to combine and convert the Year and Month variablse to an end-of-month Date value (using a combination of MDY() and INTNX() function).

Super User
Posts: 9,856

Re: Merging 2 Datasets

For your scenario , option GROUPFOTMAT is a good choice.

data have1;
  infile datalines;
  input permno year month purch_cnt purch_amt;
  datalines;
39220 2002 6 1 135135
39220 2002 7 1 154625
39220 2002 8 3 25625645
39220 2002 9 5 23451345
39221 2002 6 6 23452134
39221 2002 7 1 14324245
39221 2002 8 4 124145254
;
run;
 
data have2;
  infile datalines;
  input permno date:yymmdd. return;
  format date date9.;
  datalines;
39220 20020630 0.064382
39220 20020731 -0.061234
39220 20020831 0.001320
39220 20020930 0.133154
39221 20020630 0.021543
39221 20020731 0.018902
39221 20020831 0.114314
;
run;

data have1;
 set have1;
 date=mdy(month,1,year);
run;

data want;
 merge have1(in=ina) have2(in=inb);
 by permno date groupformat;
 format date monyy7.;
 if ina and inb;
run;

Xia Keshan

Ask a Question
Discussion stats
  • 6 replies
  • 252 views
  • 0 likes
  • 4 in conversation