Hi,
I am trying to import data from txt file or csv as follow.
Date Mkt-RF SMB HML RF
19260701 0.10 -0.24 -0.28 0.009
19260702 0.45 -0.32 -0.08 0.009
19260706 0.17 0.27 -0.35 0.009
19260707 0.09 -0.59 0.03 0.009
19260708 0.21 -0.36 0.15 0.009
19260709 -0.71 0.44 0.56 0.009
My code I tried is invalid data with date, even I try different formats for date but it can not run. And I am not sure for the format of mkt-rf, smb, hml, rf are right or not, they should be numeric as interest rate.
Please help. Thank you.
data fff ;
%let _EFIERR_ = 0; /* set the ERROR detection macro variable */
infile 'C:\Sirimon\ha\Data\F-F_Research_Data_Factors_daily_csv\F-F_Research_Data_Factors_daily.csv' delimiter=';' MISSOVER DSD lrecl=32767 firstobs=6 ;
informat Date yymmdd8.
mkt_rf $12.
smb $12.
rf $12. ;
format Date yymmdd8.
mkt_rf $12.
smb $12.
rf $12.;
input
Date
mkt_rf
smb
rf
;
if _ERROR_ then call symputx('_EFIERR_',1); /* set ERROR detection macro variable */
run;
Your data, as shown, didn't have any delimiter other than a space.
Your infile statement indicated that the first data line was line 6 but, your example, had it as line 2.
Your input statement only inputted 4 variables, while you had 5. And, other than the dates, you were inputting numbers as characters.
I presume you want something like:
data fff ; infile 'C:\art\F-F_Research_Data_Factors_daily.txt' truncover lrecl=32767 firstobs=2 ; informat Date yymmdd8.; format Date yymmdd10.; input Date mkt_rf smb hml rf; run;
Art, CEO, AnalystFinder.com
1) put the %LET statement out in before the data step
2) You defined INFORMAT of some variables longer then the real informat
e.g. informat of mkt_rf $12. while text is probably 0.10 - just length of 4
3) You defined delimiter = ';' while given input has no semi-colon at all
4) Did you got invalid date from line 2 on ?
Try next code:
%let _EFIERR_ = 0; /* set the ERROR detection macro variable */ data fff ; infile 'C:\Sirimon\ha\Data\F-F_Research_Data_Factors_daily_csv\F-F_Research_Data_Factors_daily.csv' delimiter=';' truncover DSD lrecl=32767 firstobs=6 ; format Date yymmdd8. mkt_rf $12. smb $12. rf $12.; input Date yymmdd8. mkt_rf smb rf ;
Do you still get INVALID DATE ?
Please post the input (few lines) as you realy heve it and post your full log.
You are tryiing to get Fama-French factors into a SAS data set. I see a number of issues:
Consider this:
data fff ;
%let _EFIERR_ = 0; /* set the ERROR detection macro variable */
infile ''C:\Sirimon\ha\Data\F-F_Research_Data_Factors_daily_csv\F-F_Research_Data_Factors_daily.csv'
delimiter=' ' MISSOVER DSD lrecl=32767 firstobs=2;
informat Date yymmdd8.
mkt_rf 12.
smb 12.
rf 12. ;
format Date yymmdd8.
mkt_rf 12.4
smb 12.4
rf 12.4;
input Date mkt_rf smb rf ;
if _ERROR_ then call symputx('_EFIERR_',1); /* set ERROR detection macro variable */
run;
I presume you are getting the data from Ken French's web site, so there should be no missing values.
This will read your data OK.
data want;
infile cards firstobs=2;
input Date :yymmdd8. Mkt_RF SMB HML RF;
format Date yymmddn8.;
cards;
Date Mkt-RF SMB HML RF
19260701 0.10 -0.24 -0.28 0.009
19260702 0.45 -0.32 -0.08 0.009
19260706 0.17 0.27 -0.35 0.009
19260707 0.09 -0.59 0.03 0.009
19260708 0.21 -0.36 0.15 0.009
19260709 -0.71 0.44 0.56 0.009
;
run;
Assuming that the data is as posted (I did a copy-paste to my EG program window). In the future use the {i} button to post log or data text, as it will preserve all characters and the formatting.
Note that I copied the variable names from the first line, with the only change being the replacement of the dash by an underline.
Think simple; your code is too complicated. Only add options when needed.
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.