Input data with Invalid data error

Contributor
Posts: 40

Hi,

 

I am trying to import data from a txt or csv file that looks as follows.

Date Mkt-RF SMB HML RF
19260701 0.10 -0.24 -0.28 0.009
19260702 0.45 -0.32 -0.08 0.009
19260706 0.17 0.27 -0.35 0.009
19260707 0.09 -0.59 0.03 0.009
19260708 0.21 -0.36 0.15 0.009
19260709 -0.71 0.44 0.56 0.009

 

The code I tried gives an invalid data error for the date. I have tried different date formats, but it still does not run. I am also not sure whether the formats for mkt-rf, smb, hml and rf are right; they should be numeric, since they are interest rates.

Please help. Thank you. 

 

data fff ;
%let _EFIERR_ = 0; /* set the ERROR detection macro variable */
infile 'C:\Sirimon\ha\Data\F-F_Research_Data_Factors_daily_csv\F-F_Research_Data_Factors_daily.csv' delimiter=';' MISSOVER DSD lrecl=32767 firstobs=6 ;


informat Date yymmdd8.
mkt_rf $12.
smb $12.
rf $12. ;


format Date yymmdd8.
mkt_rf $12.
smb $12.
rf $12.;


input
Date
mkt_rf
smb
rf
;
if _ERROR_ then call symputx('_EFIERR_',1); /* set ERROR detection macro variable */
run;

 

PROC Star
Posts: 7,492

Re: Input data with Invalid data error

Posted in reply to yotsuba88

Your data, as shown, didn't have any delimiter other than a space.

 

Your infile statement indicated that the first data line was line 6 but, in your example, it is line 2.

 

Your input statement only inputted 4 variables, while you had 5. And, other than the dates, you were inputting numbers as characters.

 

I presume you want something like:

data fff ;
  infile 'C:\art\F-F_Research_Data_Factors_daily.txt' truncover lrecl=32767 firstobs=2 ;
  informat Date yymmdd8.;
  format Date yymmdd10.;
  input Date mkt_rf smb hml rf;
run;  

Art, CEO, AnalystFinder.com

 

Trusted Advisor
Posts: 1,586

Re: Input data with Invalid data error

Posted in reply to yotsuba88

1) Put the %LET statement before the data step.

2) You defined the INFORMAT of some variables as longer than the actual data,

     e.g. the informat of mkt_rf is $12. while the text is probably 0.10, a length of just 4.

3) You defined delimiter = ';' while the given input has no semicolon at all.

4) Did you get the invalid date from line 2 on?

   

Try the following code:

%let _EFIERR_ = 0; /* set the ERROR detection macro variable */

data fff ;
infile 'C:\Sirimon\ha\Data\F-F_Research_Data_Factors_daily_csv\F-F_Research_Data_Factors_daily.csv' delimiter=';' truncover DSD lrecl=32767 firstobs=6 ;

format Date yymmdd8.
mkt_rf $12.
smb $12.
rf $12.;

input
Date yymmdd8.
mkt_rf
smb
rf
;
run;

Do you still get INVALID DATE ?

 

Please post the input (a few lines) as you really have it, and post your full log.

Trusted Advisor
Posts: 1,022

Re: Input data with Invalid data error

Posted in reply to yotsuba88

You are trying to get Fama-French factors into a SAS data set.  I see a number of issues:

 

  1. You name the file as if it is comma-separated, but the data are actually space-separated and your DLM= option specifies a ';'.
  2. You are reading the factor values as character; they should be numeric.
  3. You have "FIRSTOBS=6". Why?  It looks like you should have FIRSTOBS=2

Consider this:

data fff ;
%let _EFIERR_ = 0; /* set the ERROR detection macro variable */
infile 'C:\Sirimon\ha\Data\F-F_Research_Data_Factors_daily_csv\F-F_Research_Data_Factors_daily.csv'
delimiter=' ' MISSOVER DSD lrecl=32767 firstobs=2;
informat Date yymmdd8. mkt_rf 12. smb 12. hml 12. rf 12. ;
format Date yymmdd8. mkt_rf 12.4 smb 12.4 hml 12.4 rf 12.4;
input Date mkt_rf smb hml rf ; /* five columns in the file, so five variables */
if _ERROR_ then call symputx('_EFIERR_',1); /* set ERROR detection macro variable */
run;

 

I presume you are getting the data from Ken French's web site, so there should be no missing values.
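
One way to confirm that nothing was read as missing is to count non-missing and missing values after the import. This is only a sketch: it assumes the data set is named fff, as in the step above, and it summarizes every numeric variable (including Date) rather than naming specific columns.

```sas
/* Sketch: assumes data set fff was created by the DATA step above. */
/* N = non-missing count, NMISS = missing count per numeric variable. */
proc means data=fff n nmiss min max;
  var _numeric_;
run;
```

If NMISS is above zero for any variable, the delimiter or FIRSTOBS= setting probably does not match the actual file.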

Super User
Posts: 7,863

Re: Input data with Invalid data error

Posted in reply to yotsuba88

This will read your data OK.

data want;
infile cards firstobs=2;
input Date :yymmdd8. Mkt_RF SMB HML RF;
format Date yymmddn8.;
cards;
Date Mkt-RF SMB HML RF
19260701 0.10 -0.24 -0.28 0.009
19260702 0.45 -0.32 -0.08 0.009
19260706 0.17 0.27 -0.35 0.009
19260707 0.09 -0.59 0.03 0.009
19260708 0.21 -0.36 0.15 0.009
19260709 -0.71 0.44 0.56 0.009
;
run;

Assuming that the data is as posted (I did a copy-paste to my EG program window). In the future use the {i} button to post log or data text, as it will preserve all characters and the formatting.

Note that I copied the variable names from the first line, with the only change being the replacement of the dash by an underline.

Think simple; your code is too complicated. Only add options when needed.
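
As a minimal sketch, the same simple approach could be pointed at an external file instead of inline CARDS data. The path below is a hypothetical placeholder, not the poster's real location, and the sketch assumes the file is space-delimited with exactly one header line, as in the posted sample.

```sas
/* Sketch only: 'C:\path\to\factors.txt' is a placeholder path. */
data want;
  /* Assumes space-delimited values and a single header line. */
  infile 'C:\path\to\factors.txt' firstobs=2 truncover;
  input Date :yymmdd8. Mkt_RF SMB HML RF;
  format Date yymmddn8.;
run;
```

If the real file turns out to be comma-separated, DLM=',' would need to be added to the INFILE statement, and FIRSTOBS= adjusted to skip however many header lines the file actually has.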

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers