10-27-2017 12:03 AM
I've tried regular merging and this proc sql but my observations end up getting deleted. The datasets have the same columns except for 1 but I want it all to be in one data set. Thanks!
create table compustat_1 as
from transform a, transform2 b
where a.gvkey=b.gvkey and
a.datadate=b.datadate and a.fyear=b.fyear and a.cusip=b.cusip and a.conm=b.conm
and a.fyr=b.fyr and a.CIKold=b.CIKold and a.cik=b.cik and a.ni=b.ni and a.revt=b.revt and
a.SICCODE=b.SICCODE and a.count=b.count and a.log_count=b.log_count and a.Size=b.Size and a.ROA-b.ROA;
10-27-2017 12:07 AM
1. Is there any errors in the log - post the full log.
2. Check your types/formats on the variables, does it match for all your key variables
3. You're doing a cross join and then filtering. Try specifying the join type instead (full/left/right) and using an ON clause instead of a WHERE.
4. Show your data step version and the log, I find that better and showing errors.
10-28-2017 10:25 PM
***on**** didnt work either
193 proc sql;
194 create table compustat_1 as
195 select *
196 from transform a, transform2 b
197 on a.gvkey=b.gvkey and
ERROR 22-322: Syntax error, expecting one of the following: ;, ',', ANSIMISS, CROSS, EXCEPT, FULL,
GROUP, HAVING, INNER, INTERSECT, JOIN, LEFT, NATURAL, NOMISS, ORDER, OUTER, RIGHT,
ERROR 76-322: Syntax error, statement will be ignored.
198 a.datadate=b.datadate and a.fyear=b.fyear and a.cusip=b.cusip and a.conm=b.conm
199 and a.fyr=b.fyr and a.CIKold=b.CIKold and a.cik=b.cik and a.ni=b.ni and a.revt=b.revt and
200 a.SICCODE=b.SICCODE and a.count=b.count and a.log_count=b.log_count and a.Size=b.Size and
10-28-2017 10:23 PM
That didn't work.
188 proc sql;
189 create table compustat_1 as
190 select *
191 from transform natural join transform2;
NOTE: Table WORK.COMPUSTAT_1 created, with 0 rows and 18 columns.
10-28-2017 11:24 PM
This likely means that at least one of the variables that the datasets have in common doesn't have the same value in both datasets. You could work around that problem by renaming the variable in one of the datasets.
proc sql; create table compustat_1 as select * from transform natural join transform2(rename=z=z2);
that way the natural join will not try to match variables z from both datasets.
10-29-2017 12:20 AM
10-29-2017 01:44 AM
Are you sure you want to merge and not concatenate the two datasets?
data compustat_1; set transform transform2; run;
What is the variable that is different? Is it only one variable in one dataset? Or one variable in each dataset? If in both does it have different names in both datasets? If it does have the same name in both datasets then which value do you want to appear in the result?
If you are merging do the only want observations that are in both? Or observations that are either?
What are your actual key variables that uniquely identify the observations in each file? What is the purpose of the other variables?
Or do you have multiple observations per key value? If so is it multiple observations in both input datasets? Or only in one?