I imported a CSV file that I am now trying to merge with a Compustat file. I have checked to make sure the matching variables are both numeric or both character, however I have an issue with the date variable. Here is my coding:
*Sort datasets before merging;
proc sort data=Co_industry; by gvkey datadate;
run;
Proc sort data=MAscoref; by gvkey2 fiscal_year_end;
run;
proc sql;
create table fundascore2 as select
a.*, b.ma_score_2015
from co_industry a left join mascoref b
on a.gvkey=b.gvkey2 and a.datadate=b.fiscal_year_end
and not missing (b.Ma_score_2015);
quit;
The issue I'm having is that the variable "ma_score_2015" is just showing as a "."
I think it has something to do with "a.datadate=b.fiscal_year_end".
When I don't include "a.datadate=b.fiscal_year_end", the numbers for "ma_score_2015" show up but they do not
correspond to the correct date. I also suspect it may be that "fiscal_year_end" is not formatted
correctly since it was an import from a CSV file. HELP! 🙂
The reason they are not matching is that FISCAL_YEAR_END is NOT a date. It is just a number that your brain is tricking you into thinking looks like a date. So you are seeing 19841231 and thinking it means 1984-12-31 when it really means 19,841,231.
So fix that table first or just change your condition.
datadate = input(put(FISCAL_YEAR_END,8.),yymmdd8.)
Can you post a couple of records from each file, as well as the output of proc contents runs on both files? My initial guess is that the Compustat Fiscal Year end may just be the 4 digit year.
Art, CEO, AnalystFinder.com
Here's the requested info.
b.Mascoref (proc contents)
Alphabetic List of Variables and Attributes | |||||
# | Variable | Type | Len | Format | Informat |
1 | FISCAL_YEAR_END | Num | 8 | BEST12. | BEST32. |
2 | MA_SCORE_2015 | Num | 8 | BEST12. | BEST32. |
3 | gvkey2 | Char | 6 |
|
|
Records
Obs | FISCAL_YEAR_END | MA_SCORE_2015 | gvkey2 |
1 | 19841231 | 0.1674489 | 001001 |
2 | 19851231 | 0.0528454 | 001001 |
3 | 19831231 | 0.0487767 | 001003 |
a.Co_industry (proc contents)
1 | gvkey | Char | 6 | $6. | $6. | Global Company Key |
2 | datadate | Num | 8 | YYMMDDN8. |
| Data Date |
Records
Obs | gvkey | datadate | at |
1 | 001004 | 19780531 | 56.5020 |
2 | 001004 | 19790531 | 64.1940 |
3 | 001004 | 19800531 | 73.7580 |
The merge with a.gvkey=b.gvkey2 works on the variable ma_score_2015, but the dates don't match.
When I include or just use a.datadate=b.fiscal_year_end the merged ma_score_2015 variable is all blank or rather just a "."
I've tried formatting fiscal_year_end to match the "format" and "informat" of datadate but that doesn't work either.
Thanks!
The reason they are not matching is that FISCAL_YEAR_END is NOT a date. It is just a number that your brain is tricking you into thinking looks like a date. So you are seeing 19841231 and thinking it means 1984-12-31 when it really means 19,841,231.
So fix that table first or just change your condition.
datadate = input(put(FISCAL_YEAR_END,8.),yymmdd8.)
That worked! Thanks!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.