Those values should have worked as 2001.01 looks like a valid number.
But if instead you had actual dates or a text variable as the first column it is not hard to add code to treat column one differently from the others.
Assuming that your data is actually comma delimited so that it looks something like this:
Date,MAR10000001,MAR10000002,MAR10000003,MAR10156052
2001.01,3,18,11,4535
2001.02,7,2,15,895
2001.03,9,0,0,0
2013.09,34,1,56,5646
Then to read that I would probably just go with something like:
data want ;
infile 'myfile.csv' lrecl=2000000 dsd truncover firstobs=2 ;
input date var1-var4 ;
run;
Then if you wanted to you could rename the variables or attache labels to them so that they match the values in row 1.
If you want to convert 2001 and two one-hundredths to a value that SAS would consider a date you could do
date = input(put(date,7.2)||'.01',yymmdd10.);
format date yymmp7. ;
Actually checking the data in Notepad.. it's continuous.. eg :
Date,MAR10000001,MAR10000002,MAR10000003,MAR10156052,2001.01,3,18,11,4535,2001.02,7,2,15,895,2001.03,9,0,0,0,
2013.09,34,1,56,5646
Notepad may not be the best choice to see the "exact" contents of a file.
I do not necessarily trust Notepad as it does not understand Unix files that use only LF as the end of line.
But assuming that your data file does NOT contain any line breaks then you will need to know how many variables there are.
Here is some example code that could read the names into one file and the data into another. You could then merge the two by the COL variable and use PROC TRANSPOSE to create an horizontal dataset. Note that I have added step to read in the DATE variable separate from the individual numeric variable values.
%let n=4 ;
data names(keep=col name)
cells(keep=row col date value)
;
dlm='0A'x||',' ;
infile tmpfile1 recfm=n dsd dlm=dlm;
length name $32 ;
length value 8 ;
retain row 0;
if row=0 then do;
input name ;
do col=1 to &n ; input name ; output names; end;
end;
row+1;
input date ;
do col=1 to &n ; input value ; output celss; end;
run;
I don't think PROC IMPORT honors LRECL, if that's what you're talking about.
Right - so if the proper LRECL is 99999 or something, PROC IMPORT will never work.
That's my impression, assuming my test is not flawed.. It did not used to be that way, maybe version 9.1.3 or 8.2 did not include that "hard coded" LRECL on the INFILE statement.
The following code doesn't do the date correction that Tom suggested, but does attempt to correct invalid variable names and assign the actual column headers as variable labels. Of course, it could be easily modified to deal with dates and character variables. However, I was more interested in simply expanding the basic method:
%let maxvar=158000 ;
%let fname='C:\art\Unit_Monthly.csv' ;
filename code2run temp;
data tall(keep=row name value) ;
array _name_ (&maxvar) $32 _temporary_ ;
array _rawnames_ (&maxvar) $255 _temporary_ ;
retain row 0 nvar 0 ;
file code2run;
infile &fname dsd truncover col=cc length=len lrecl=2000000 end=eof;
if _n_=1 then do col=1 by 1 until (cc > len);
length name label $255 ;
length outlabel $300;
input label @;
name=compress(translate(label,'_',' '),,'kn');
if anydigit(name) eq 1 then name=catt("_",substr(name,1));
if length(name) gt 32 then name=substr(name,1,32);
_rawnames_(col)=name;
_j_=0;
if col gt 1 then do;
do _i_=1 to col-1;
if name eq _rawnames_(_i_) then _j_+1;
end;
end;
if _j_ gt 0 then do;
if length(name) gt 30 then name=substr(name,1,30);
name=catt(name,put(_j_,z2.));
end;
_name_(col) = name ;
outlabel=catx(' ','label',name,'=', label,';');
put outlabel;
end;
else do col=1 by 1 until (cc > len);
input value @;
name = _name_(col);
output;
end;
nvar = max(nvar,col);
row+1;
if eof then putlog 'NOTE: ' row 'observations with ' nvar 'variables.' ;
run;
proc transpose data=tall out=wide (drop=_name_) ;
by row ;
id name ;
var value ;
run;
data want;
set wide;
%include code2run;
run;
Thanks Arthur, your code works fine, the data is just so big and so much that it takes too long to run.. I went back to the source provider and asked them to chop it up as per only the fields I needed.
Thanks to all for your help.
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.