BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Tom
Super User Tom
Super User

Those values should have worked as 2001.01 looks like a valid number.

But if instead you had actual dates or a text variable as the first column it is not hard to add code to treat column one differently from the others.

Assuming that your data is actually comma delimited so that it looks something like this:

Date,MAR10000001,MAR10000002,MAR10000003,MAR10156052

2001.01,3,18,11,4535

2001.02,7,2,15,895

2001.03,9,0,0,0

2013.09,34,1,56,5646

Then to read that I would probably just go with something like:

data want ;

  infile 'myfile.csv' lrecl=2000000 dsd truncover firstobs=2 ;

  input date var1-var4  ;

run;


Then if you wanted to you could rename the variables or attache labels to them so that they match the values in row 1.

If you want to convert 2001 and two one-hundredths to a value that SAS would consider a date you could do

date = input(put(date,7.2)||'.01',yymmdd10.);

format date yymmp7. ;

podarum
Quartz | Level 8

Actually checking the data in Notepad.. it's continuous.. eg  :

Date,MAR10000001,MAR10000002,MAR10000003,MAR10156052,2001.01,3,18,11,4535,2001.02,7,2,15,895,2001.03,9,0,0,0,

2013.09,34,1,56,5646

data_null__
Jade | Level 19

Notepad may not be the best choice to see the "exact" contents of a file.

Tom
Super User Tom
Super User

I do not necessarily trust Notepad as it does not understand Unix files that use only LF as the end of line.

But assuming that your data file does NOT contain any line breaks then you will need to know how many variables there are.

Here is some example code that could read the names into one file and the data into another. You could then merge the two by the COL variable and use PROC TRANSPOSE to create an horizontal dataset.  Note that I have added step to read in the DATE variable separate from the individual numeric variable values.

%let n=4 ;

data names(keep=col name)

     cells(keep=row col date value)

;

  dlm='0A'x||',' ;

  infile tmpfile1 recfm=n dsd dlm=dlm;

  length name $32 ;

  length value 8 ;

  retain row 0;

  if row=0 then do;

     input name ;

     do col=1 to &n ; input name ; output names; end;

  end;

  row+1;

  input date ;

  do col=1 to &n ; input value ; output celss; end;

run;

data_null__
Jade | Level 19

I don't think PROC IMPORT honors LRECL, if that's what you're talking about.  

18         filename ft15f001 temp lrecl=42767;
19         proc import file=ft15f001 dbms=csv out=tempcsv replace;
20            parmcards;
21         a,b,c
22         1,2,3
23            ;;;;

24          /**********************************************************************
25          *   PRODUCT:   SAS
26          *   VERSION:   9.3
27          *   CREATOR:   External File Interface
28          *   DATE:      20NOV13
29          *   DESC:      Generated SAS Datastep Code
30          *   TEMPLATE SOURCE:  (None Specified.)
31          ***********************************************************************/

32             data WORK.TEMPCSV    ;
33             %let _EFIERR_ = 0; /* set the ERROR detection macro variable */
34             infile FT15F001 delimiter = ',' MISSOVER DSD lrecl=32767 firstobs=2 ;
35                informat a best32. ;
36                informat b best32. ;
37                informat c best32. ;
38                format a best12. ;
39                format b best12. ;
40                format c best12. ;
41             input
42                         a
43                         b
44                         c
45             ;
46             if _ERROR_ then call symputx('_EFIERR_',1);  /* set ERROR detection macro variable */
47             run;
snoopy369
Barite | Level 11

Right - so if the proper LRECL is 99999 or something, PROC IMPORT will never work.

data_null__
Jade | Level 19

That's my impression, assuming my test is not flawed..  It did not used to be that way, maybe version 9.1.3 or 8.2 did not include that "hard coded" LRECL on the INFILE statement.

art297
Opal | Level 21

The following code doesn't do the date correction that Tom suggested, but does attempt to correct invalid variable names and assign the actual column headers as variable labels.  Of course, it could be easily modified to deal with dates and character variables.  However, I was more interested in simply expanding the basic method:

%let maxvar=158000 ;

%let fname='C:\art\Unit_Monthly.csv' ;

filename code2run temp;

data tall(keep=row name value) ;

  array _name_ (&maxvar) $32 _temporary_ ;

  array _rawnames_ (&maxvar) $255 _temporary_ ;

  retain row 0 nvar 0 ;

  file code2run;

  infile &fname dsd truncover col=cc length=len lrecl=2000000 end=eof;

  if _n_=1 then do col=1 by 1 until (cc > len);

     length name label $255 ;

     length outlabel $300;

     input label @;

     name=compress(translate(label,'_',' '),,'kn');

     if anydigit(name) eq 1 then name=catt("_",substr(name,1));

     if length(name) gt 32 then name=substr(name,1,32);

     _rawnames_(col)=name;

     _j_=0;

     if col gt 1 then do;

       do _i_=1 to col-1;

         if name eq _rawnames_(_i_) then _j_+1;

       end;

     end;

     if _j_ gt 0 then do;

       if length(name) gt 30 then name=substr(name,1,30);

       name=catt(name,put(_j_,z2.));

     end;

     _name_(col) = name ;

     outlabel=catx(' ','label',name,'=', label,';');

     put outlabel;

  end;

  else do col=1 by 1 until (cc > len);

    input value @;

    name = _name_(col);

    output;

  end;

  nvar = max(nvar,col);

  row+1;

  if eof then putlog 'NOTE: ' row 'observations with ' nvar 'variables.' ;

run;

proc transpose data=tall out=wide (drop=_name_) ;

  by row ;

  id name ;

  var value ;

run;

data want;

  set wide;

  %include code2run;

run;

podarum
Quartz | Level 8

Thanks Arthur, your code works fine, the data is just so big and so much that it takes too long to run.. I went back to the source provider and asked them to chop it up as per only the fields I needed.

Thanks to all for your help.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 23 replies
  • 1709 views
  • 3 likes
  • 6 in conversation