BookmarkSubscribeRSS Feed
yotsuba88
Quartz | Level 8

Hi,

 

I am trying to import data from txt file or csv as follow.

Date Mkt-RF SMB HML RF
19260701 0.10 -0.24 -0.28 0.009
19260702 0.45 -0.32 -0.08 0.009
19260706 0.17 0.27 -0.35 0.009
19260707 0.09 -0.59 0.03 0.009
19260708 0.21 -0.36 0.15 0.009
19260709 -0.71 0.44 0.56 0.009

 

My code I tried is invalid data with date, even I try different formats for date but it can not run. And I am not sure for the format of mkt-rf, smb, hml, rf are right or not, they should be numeric as interest rate. 

Please help. Thank you. 

 

data fff ;
%let _EFIERR_ = 0; /* set the ERROR detection macro variable */
infile 'C:\Sirimon\ha\Data\F-F_Research_Data_Factors_daily_csv\F-F_Research_Data_Factors_daily.csv' delimiter=';' MISSOVER DSD lrecl=32767 firstobs=6 ;


informat Date yymmdd8.
mkt_rf $12.
smb $12.
rf $12. ;


format Date yymmdd8.
mkt_rf $12.
smb $12.
rf $12.;


input
Date
mkt_rf
smb
rf
;
if _ERROR_ then call symputx('_EFIERR_',1); /* set ERROR detection macro variable */
run;

 

4 REPLIES 4
art297
Opal | Level 21

Your data, as shown, didn't have any delimiter other than a space.

 

Your infile statement indicated that the first data line was line 6 but, your example, had it as line 2.

 

Your input statement only inputted 4 variables, while you had 5. And, other than the dates, you were inputting numbers as characters.

 

I presume you want something like:

data fff ;
  infile 'C:\art\F-F_Research_Data_Factors_daily.txt' truncover lrecl=32767 firstobs=2 ;
  informat Date yymmdd8.;
  format Date yymmdd10.;
  input Date mkt_rf smb hml rf;
run;  

Art, CEO, AnalystFinder.com

 

Shmuel
Garnet | Level 18

1) put the %LET statement out in before the data step

2) You defined INFORMAT of some variables longer then the real informat

     e.g. informat of mkt_rf $12. while text is probably 0.10 - just length of 4

3) You defined delimiter = ';' while given input has no semi-colon at all

4) Did you got invalid date from line 2 on ?

   

Try next code:

%let _EFIERR_ = 0; /* set the ERROR detection macro variable */

data fff ;
infile 'C:\Sirimon\ha\Data\F-F_Research_Data_Factors_daily_csv\F-F_Research_Data_Factors_daily.csv' delimiter=';' truncover DSD lrecl=32767 firstobs=6 ;

format Date yymmdd8.
mkt_rf $12.
smb $12.
rf $12.;

input
Date yymmdd8.
mkt_rf
smb
rf
;

Do you still get INVALID DATE ?

 

Please post the input (few lines) as you realy heve it and post your full log. 

mkeintz
PROC Star

You are tryiing to get Fama-French factors into a SAS data set.  I see a number of issues:

 

  1. You name the file as if it is comma-separated, but the data are actually space separated and your DLM= option specifies a ':'.
  2. You are reading the factor values as character, they should be numeric
  3. You have "FIRSTOBS=6". Why?  It looks like you should have FIRSTOBS=2

Consider this:

data fff ;
%let _EFIERR_ = 0; /* set the ERROR detection macro variable */
infile ''C:\Sirimon\ha\Data\F-F_Research_Data_Factors_daily_csv\F-F_Research_Data_Factors_daily.csv' 
delimiter=' ' MISSOVER DSD lrecl=32767 firstobs=2; informat Date yymmdd8. mkt_rf 12. smb 12. rf 12. ; format Date yymmdd8. mkt_rf 12.4 smb 12.4 rf 12.4; input Date mkt_rf smb rf ; if _ERROR_ then call symputx('_EFIERR_',1); /* set ERROR detection macro variable */ run;

 

I presume you are getting the data from Ken French's web site, so there should be no missing values.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Kurt_Bremser
Super User

This will read your data OK.

data want;
infile cards firstobs=2;
input Date :yymmdd8. Mkt_RF SMB HML RF;
format Date yymmddn8.;
cards;
Date Mkt-RF SMB HML RF
19260701 0.10 -0.24 -0.28 0.009
19260702 0.45 -0.32 -0.08 0.009
19260706 0.17 0.27 -0.35 0.009
19260707 0.09 -0.59 0.03 0.009
19260708 0.21 -0.36 0.15 0.009
19260709 -0.71 0.44 0.56 0.009
;
run;

Assuming that the data is as posted (I did a copy-paste to my EG program window). In the future use the {i} button to post log or data text, as it will preserve all characters and the formatting.

Note that I copied the variable names from the first line, with the only change being the replacement of the dash by an underline.

Think simple; your code is too complicated. Only add options when needed.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 4303 views
  • 0 likes
  • 5 in conversation