Hi, I have two dataset, one with quarterly data and the other will monthly data. I want to use proc sql to merge them together. Specifically, A(quarterly data) has firmid, month and year. B(monthly data) has firmid, month, year and return. I wrote the following code, but the output has missing data for all variables except for return from B. How to do this?
proc sql;
create table raw as
select *
from A full join B
on A.firmid=B.firmid & A.year=B.year & A.month=B.month;
quit;
I need to use full join since I want all the monthly returns. A only has quarterly data for month and year.
some sample data of the data one, data two and the data you want would help understanding your question.
Yes, you are right. Here are some hypothetical sample data.
A:
firmid year month vol
1 1990 03 10
1 1990 06 20
1 1990 09 15
2 1990 03 18
...
B:
firmid year month return
1 1990 01 0.2
1 1990 02 0.2
1 1990 03 0.5
2 1990 01 0.1
...
What I want
raw:
firmid year month return vol
1 1990 01 0.2 .
1 1990 02 0.2 .
1 1990 03 0.5 10
2 1990 01 0.1 .
...
That' because you name A first in the from, which means that columns from a is selected over columns for table B (if name match). You should have a note in the log telling you this.
When using full joins you (always) need to use the coalesce() function on common named columns, especially columns in the join criteria.
Hi SeanZ
Please Try this code (Not using Proc sql)
Data A;
input firmid year month vol;
cards;
1 1990 03 10
1 1990 06 20
1 1990 09 15
2 1990 03 18
;
run;
Data B;
input firmid year month return;
cards;
1 1990 01 0.2
1 1990 02 0.2
1 1990 03 0.5
2 1990 01 0.1
;
run;
proc sort data=A ;
by firmid year month;
run;
proc sort data=B ;
by firmid year month;
run;
data AandB;
merge A(in=a) B(in=b);
by firmid year month;
if b;
if return=. or vol=. then delete;
run;
/*Keep the months in A not AandB*/
proc sort data= AandB;
by firmid year month return;
run;
proc sort data=B ;
by firmid year month return;
run;
/*Find the common months between monthly and quarterly data*/
data A_1;
merge B (in=a) AandB (in=b);
by firmid year month return;
if a and not b;
run;
/*append with the monthly data*/
data want;
set A_1 AandB;
run;
proc sort data=want;
by firmid year month ;
run;
data set1;
input firmid year month vol;
datalines;
1 1990 03 10
1 1990 06 20
1 1990 09 15
2 1990 03 18
;
run;
data set2;
input firmid year month return;
datalines;
1 1990 01 0.2
1 1990 02 0.2
1 1990 03 0.5
2 1990 01 0.1
;
run;
proc sql;
select a.*, b.vol
from
set2 a
left join set1 b
on a.firmid=b.firmid and a.year=b.year and a.month=b.month
;
quit;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.