Merging data sets that have different range for 'by variable'

Reply
Contributor
Posts: 22

Merging data sets that have different range for 'by variable'

Hey all,

I am trying to merge two data sets, but they have different range in dates and port_no (my class variables, i think that's what they're called).

The two data sets I have are: have1

Date               Identifier          Ret

011980               3424               2.4

021980               3543                4.3

031980               432

...

...

121980               4332

011981                423

....

...

...

112011

And the other one is the yearly one, have2

Date               Identifier          Value

1979                    432               4.2

1980                    4332    

1981                    423

1979

1980

1981

....

....

2011

So I want to merge have1 and have2. I want them to be monthly data, so I think, first, i need to convert have2 into monthly data assigning equal Value to each month for that year. I tried to use Proc Expand, but it didn't work.

And then I want to merge the two, only if the Identifier in have2 also exists in have1. I don't want data's of Identifier that are not in have1. I tried this with the regular Merge, and thought about deleting the observation of Identifiers not in original have1, but that messed up my data and couldn't do it.

Anyway around it? Thanks everyone.

Regular Contributor
Posts: 233

Re: Merging data sets that have different range for 'by variable'

data have1;
input Date Identifier Value;
cards;
011980 3424 2.4
021980 3543 4.3
031980 432  5.5
;
run;

proc sort data=have1 tagsort;
by Identifier;
run;

data have2;
input Date Identifier Value;by date;
cards;
1979 432  4.2
1980 4332 1.2
1981 423  2.2
1979 123  2.3
1980 127  2.4
1981 4237 2.5
;
run;

proc sort data=have2 tagsort;
by date;
run;

data have3 ;
do _n_=1 by 1 until (last.Date);
set have2;
by date;
date=cats(_n_,Date);
output;
end;
run;

proc sort data = have3 tagsort;
by Identifier;
run;

data want;
merge have1 (in=a)
   have3 (in=b);
by Identifier;
if a=b;
run;

Super Contributor
Posts: 349

Re: Merging data sets that have different range for 'by variable'

Hi Sharmas,

Try this..Hope it helps..

data have1(drop=date);

input Date $ Identifier Value;

date1 = mdy( substr(date,1,2),

           1, substr(date,3,4) );

format date1 monyy7.;

cards;

011980 3424 2.4

021980 3543 4.3

031980 432  5.5

121980 4332

011981 423

;

run;

proc sort data=have1 ;

by Identifier ;

run;

data have2(drop=date i);

input Date Identifier Value;

do i=1 to 12;

date1=intnx('month',input(compress('Jan'||date),monyy7.),i);

output;

end;

format date1 monyy7.;

cards;

1979 432  4.2

1980 4332 1.2

1981 423  2.2

;

run;

proc sort data=have2 ;by identifier ;run;

data want;

merge have1(in=a) have2(in=b);

by identifier ;

if b and a;

run;

Thanks,

Shiva

Ask a Question
Discussion stats
  • 2 replies
  • 264 views
  • 0 likes
  • 3 in conversation