BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Yegen
Pyrite | Level 9

I am trying to import files that start with the same name: Combined (e.g., Combined1.csv, Combined2.csv, Combined3.csv). I have written a macro that gives an error. I think the error is occuring because of the delimiter. However, without the delimiter, the imported files do not have observations in it. The number and name of the columns is identical, but the number of rows/records is different. 

Here is my macro:

%Macro importing;
%Do i=1 %To 14;

data WORK.malm&i;
infile "C:\Users\yegen\Desktop\Malmqvist Files\Malmqvist Files\combined&i..csv"
! delimiter = ',' MISSOVER DSD lrecl=32767 firstobs=2 ;
 informat DMUj0 $10. ;
   informat EffPeriod1 best32. ;
   informat EffPeriod2 best32. ;
    informat EffChange best32. ;
       informat TechChange best32. ;
       informat MalmIndex best32. ;
          informat group best32. ;
          informat industry best32. ;
          format DMUj0 $10. ;
          format EffPeriod1 best12. ;
          format EffPeriod2 best12. ;
          format EffChange best12. ;
         format TechChange best12. ;
          format MalmIndex best12. ;
          format group best12. ;
          format industry best12. ;
       input
                   DMUj0 $
                   EffPeriod1
                   EffPeriod2
                   EffChange
                   TechChange
                   MalmIndex
                   group
                   industry
;
  run;

  %end;
%Mend;

%importing;



Here is the log file:


23: LINE and COLUMN cannot be determined.
NOTE 242-205: NOSPOOL is on. Rerunning with OPTION SPOOL might allow recovery of the LINE and
              COLUMN where the error has occurred.
ERROR 23-2: Invalid option name !.

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.MALM1 may be incomplete.  When this step was stopped there were 0
         observations and 8 variables.
WARNING: Data set WORK.MALM1 was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds


23: LINE and COLUMN cannot be determined.
NOTE 242-205: NOSPOOL is on. Rerunning with OPTION SPOOL might allow recovery of the LINE and
              COLUMN where the error has occurred.
ERROR 23-2: Invalid option name !.

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.MALM2 may be incomplete.  When this step was stopped there were 0
         observations and 8 variables.
WARNING: Data set WORK.MALM2 was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds


23: LINE and COLUMN cannot be determined.
NOTE 242-205: NOSPOOL is on. Rerunning with OPTION SPOOL might allow recovery of the LINE and
              COLUMN where the error has occurred.
ERROR 23-2: Invalid option name !.

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.MALM3 may be incomplete.  When this step was stopped there were 0
         observations and 8 variables.
WARNING: Data set WORK.MALM3 was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds


23: LINE and COLUMN cannot be determined.
NOTE 242-205: NOSPOOL is on. Rerunning with OPTION SPOOL might allow recovery of the LINE and
              COLUMN where the error has occurred.
ERROR 23-2: Invalid option name !.

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.MALM4 may be incomplete.  When this step was stopped there were 0
         observations and 8 variables.
WARNING: Data set WORK.MALM4 was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.01 seconds


23: LINE and COLUMN cannot be determined.
NOTE 242-205: NOSPOOL is on. Rerunning with OPTION SPOOL might allow recovery of the LINE and
              COLUMN where the error has occurred.
ERROR 23-2: Invalid option name !.

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.MALM5 may be incomplete.  When this step was stopped there were 0
         observations and 8 variables.
WARNING: Data set WORK.MALM5 was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.01 seconds


23: LINE and COLUMN cannot be determined.
NOTE 242-205: NOSPOOL is on. Rerunning with OPTION SPOOL might allow recovery of the LINE and
              COLUMN where the error has occurred.
ERROR 23-2: Invalid option name !.

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.MALM6 may be incomplete.  When this step was stopped there were 0
         observations and 8 variables.
WARNING: Data set WORK.MALM6 was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds


23: LINE and COLUMN cannot be determined.
NOTE 242-205: NOSPOOL is on. Rerunning with OPTION SPOOL might allow recovery of the LINE and
              COLUMN where the error has occurred.
ERROR 23-2: Invalid option name !.

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.MALM7 may be incomplete.  When this step was stopped there were 0
         observations and 8 variables.
WARNING: Data set WORK.MALM7 was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds


23: LINE and COLUMN cannot be determined.
NOTE 242-205: NOSPOOL is on. Rerunning with OPTION SPOOL might allow recovery of the LINE and
              COLUMN where the error has occurred.
ERROR 23-2: Invalid option name !.

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.MALM8 may be incomplete.  When this step was stopped there were 0
         observations and 8 variables.
WARNING: Data set WORK.MALM8 was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.01 seconds


23: LINE and COLUMN cannot be determined.
NOTE 242-205: NOSPOOL is on. Rerunning with OPTION SPOOL might allow recovery of the LINE and
              COLUMN where the error has occurred.
ERROR 23-2: Invalid option name !.

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.MALM9 may be incomplete.  When this step was stopped there were 0
         observations and 8 variables.
WARNING: Data set WORK.MALM9 was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds


23: LINE and COLUMN cannot be determined.
NOTE 242-205: NOSPOOL is on. Rerunning with OPTION SPOOL might allow recovery of the LINE and
              COLUMN where the error has occurred.
ERROR 23-2: Invalid option name !.

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.MALM10 may be incomplete.  When this step was stopped there were 0
         observations and 8 variables.
WARNING: Data set WORK.MALM10 was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds


23: LINE and COLUMN cannot be determined.
NOTE 242-205: NOSPOOL is on. Rerunning with OPTION SPOOL might allow recovery of the LINE and
              COLUMN where the error has occurred.
ERROR 23-2: Invalid option name !.

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.MALM11 may be incomplete.  When this step was stopped there were 0
         observations and 8 variables.
WARNING: Data set WORK.MALM11 was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds


23: LINE and COLUMN cannot be determined.
NOTE 242-205: NOSPOOL is on. Rerunning with OPTION SPOOL might allow recovery of the LINE and
              COLUMN where the error has occurred.
ERROR 23-2: Invalid option name !.

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.MALM12 may be incomplete.  When this step was stopped there were 0
         observations and 8 variables.
WARNING: Data set WORK.MALM12 was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.01 seconds


23: LINE and COLUMN cannot be determined.
NOTE 242-205: NOSPOOL is on. Rerunning with OPTION SPOOL might allow recovery of the LINE and
              COLUMN where the error has occurred.
ERROR 23-2: Invalid option name !.

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.MALM13 may be incomplete.  When this step was stopped there were 0
         observations and 8 variables.
WARNING: Data set WORK.MALM13 was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds


23: LINE and COLUMN cannot be determined.
NOTE 242-205: NOSPOOL is on. Rerunning with OPTION SPOOL might allow recovery of the LINE and
              COLUMN where the error has occurred.
ERROR 23-2: Invalid option name !.

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.MALM14 may be incomplete.  When this step was stopped there were 0
         observations and 8 variables.
WARNING: Data set WORK.MALM14 was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

 


When I manually import the data (e.g., Malm2), the import works well and here is the log file:


 data WORK.example    ;
198      %let _EFIERR_ = 0; /* set the ERROR detection macro variable */
199      infile 'C:\Users\yegen\Desktop\Malmqvist Files\Malmqvist Files\combined2.csv'
199! delimiter = ',' MISSOVER DSD lrecl=32767 firstobs=2 ;
200         informat DMUj0 $10. ;
201         informat EffPeriod1 best32. ;
202         informat EffPeriod2 best32. ;
203         informat EffChange best32. ;
204         informat TechChange best32. ;
205         informat MalmIndex best32. ;
206         informat group best32. ;
207         informat industry best32. ;
208         format DMUj0 $10. ;
209         format EffPeriod1 best12. ;
210         format EffPeriod2 best12. ;
211         format EffChange best12. ;
212         format TechChange best12. ;
213         format MalmIndex best12. ;
214         format group best12. ;
215         format industry best12. ;
216      input
217                  DMUj0 $
218                  EffPeriod1
219                  EffPeriod2
220                  EffChange
221                  TechChange
222                  MalmIndex
223                  group
224                  industry
225      ;
226      if _ERROR_ then call symputx('_EFIERR_',1);  /* set ERROR detection macro variable */
227      run;

NOTE: The infile 'C:\Users\eyub.yegen15\Desktop\Malmqvist Files\Malmqvist Files\combined2.csv' is:
      Filename=C:\Users\eyub.yegen15\Desktop\Malmqvist Files\Malmqvist Files\combined2.csv,
      RECFM=V,LRECL=32767,File Size (bytes)=515848,
      Last Modified=11Jul2017:04:20:28,
      Create Time=11Jul2017:12:24:50

NOTE: 9604 records were read from the infile 'C:\Users\eyub.yegen15\Desktop\Malmqvist
      Files\Malmqvist Files\combined2.csv'.
      The minimum record length was 25.
      The maximum record length was 85.
NOTE: The data set WORK.EXAMPLE has 9604 observations and 8 variables.
NOTE: DATA statement used (Total process time):
      real time           0.02 seconds
      cpu time            0.01 seconds


9604 rows created in WORK.example from C:\Users\eyub.yegen15\Desktop\Malmqvist Files\Malmqvist
Files\combined2.csv.



NOTE: WORK.EXAMPLE data set was successfully created.
NOTE: The data set WORK.EXAMPLE has 9604 observations and 8 variables.

 

1 ACCEPTED SOLUTION

Accepted Solutions
Shmuel
Garnet | Level 18

1)  you can save typing by change:

  

informat var1 fmt1.;
informat var2 fmt2.;
...

into:

informat var1 fmt1.  var2 fmt2.;

and if fmt1 is same as fmt2 it can be written as:

 

informat var1 var2 fmt.;

 

2)  Your error tells:  Invalid option name !.

     Why do you added the ! (exclamation mark) in your code:

   

infile "C:\Users\yegen\Desktop\Malmqvist Files\Malmqvist Files\combined&i..csv"
! delimiter = ',' MISSOVER DSD lrecl=32767 firstobs=2 ;

 

 

 

 

 

 

View solution in original post

13 REPLIES 13
Shmuel
Garnet | Level 18

1)  you can save typing by change:

  

informat var1 fmt1.;
informat var2 fmt2.;
...

into:

informat var1 fmt1.  var2 fmt2.;

and if fmt1 is same as fmt2 it can be written as:

 

informat var1 var2 fmt.;

 

2)  Your error tells:  Invalid option name !.

     Why do you added the ! (exclamation mark) in your code:

   

infile "C:\Users\yegen\Desktop\Malmqvist Files\Malmqvist Files\combined&i..csv"
! delimiter = ',' MISSOVER DSD lrecl=32767 firstobs=2 ;

 

 

 

 

 

 

Yegen
Pyrite | Level 9

Thanks very much @Shmuel. Your suggestion has solved the error. After deleting the !, the macro worked. The reason why I have included ! is that when I manually imported the one of the samples, in the code that was displayed in the log file there was a ! after the file name and before the delimiter. 

Astounding
PROC Star

Also note, this program tries to read files named combined1.csv, combined2.csv, ... combined14.csv not Malm1.csv etc.

Yegen
Pyrite | Level 9

Sorry for the confusion, @Astounding. That was a simple typo that I will fix in my original question. The file names start with combined. Thanks for pointing it out. 

Yegen
Pyrite | Level 9

Also, one more quick clarification question. What exactly does the following do in the code?

MISSOVER DSD lrecl=32767
Shmuel
Garnet | Level 18

Your input file is defined within a delimiter.

Your input statemant is in a format of:  INPUT var1 var2 ...varn;

In case  one of those variables is miising, you probably want to get its missing value,

therefore you need the MISSOVER option.

 

In case there are consequent delimiters, you probably still want to relate to each as skipping to next variable,

and assigning missing value to those variables skipped, that is done by the DSD option.

 

LRECL means Logical Record Length . 

 

I hope it answers your question. 

Yegen
Pyrite | Level 9

This was very clear and helpful. I really appreciate your helpful explanations and help as always, @Shmuel.


Tom
Super User Tom
Super User

Your original LOG from the running PROC IMPORT included the ! because the line of code generated was longer than the line length SAS was using to write the LOG.  Notice how the line number with the ! at the beginning is the same as the one before.

 

You can greatly simplify your data step code since character variable and numbers do not require that you attach special INFORMATs to read them or special FORMATs for them to be properly displayed.  You will have better control of how your variables are defined if they are defined explicitly using a LENGTH or ATTRIB statement. rather than forcing SAS to guess at how you intended to define the variable based on what FORMAT or INFORMAT you attached to it.  Also if you define the variables in the order that they will be read you can simplify the INPUT statement by using a variable list.

 

Also if you want to read ALL of the CSV files with that name pattern (rather than just those with numeric suffixes of 1 to 14) you can actually do it with one data step instead of in a loop. 

 

data malm;
  infile "...\combined*.csv" dsd truncover eov=eov firstobs=2 ;
  length DMUj0 $10 EffPeriod1 EffPeriod2 EffChange 8
    TechChange 8 MalmIndex 8
    group 8 industry 8
  ;
  input @;
  if eov then input ;
  input DMUj0 -- industry ;
  eov=0;
run;

If you don't have a variable in the data that will let you know which source file the record came you can look into using the FILENAME= option on the INFILE statement.

Yegen
Pyrite | Level 9

Thank you so much for this really helpful explanation, @Tom. Your explanation makes a lot of sense and I have learned quite a lot from your post. 

I assume the "\combined*" part does the same as "work.combined:" in the data statement (assuming that all combined files are in the work library), but one needs to use * instead of : since it is an infile statement.


What exactly does the "eov" statement? Does it skip the first line of the data (i.e., variable name)? Also, I could not understand why we need to use @ after input. 

Tom
Super User Tom
Super User

The pattern used in anINFILE statement is like the pattern you would use in the DIR command in DOS/Windows or the ls command in Unix. It really has nothing to do with how you would specify dataset names in SAS code.  Note that the SAS infile statement will only let you use a single wildcard character to replace just one part of the filename. As to the '...\combined' part of the pattern I just cut out the specific directory name you included in your post because it didn't really matter to understanding the answer and it made the line too long.  Obviously you would need to test that the pattern actually matches the exact set of files you want to read. You can also make a fileref that points to multiple files by using a FILENAME statement and then use the fileref you created in the INFILE statement instead of either a quoted physical filenname (as in your example) or a pattern (as in my example).  Here is the format for FILENAME statement that points to mutliple files.  Basically a list of quoted physical filenames inside a parentheses.

 

filename myfiles 
('...\combined1.csv'
 '...\combined2.csv'
 '...\combined3.csv'
)
;

 

The EOV= option on the INFILE statement will name a variable that SAS will set to 1 (true) when you start reading from a new file. It really only makes sense when the INFILE statement is reading from multiple files like in my example.  The 'INPUT @;' line will read the next line from the input file(s) and hold it so that it can be read later in the current iteration of the DATA step. You need it so that the EOV variable will be properly set as it is not set until you start to read from the new file. The code in the IF block statement will cause SAS to read in the first line of the file and basically ignore it. That way the real INPUT statement will not have to deal with the header line. Also SAS will not reset the EOV= variable back to false, so that is why the code also includes a line to set EOV back to zero (false) so that we can then detect the first line of the next file. 

Shmuel
Garnet | Level 18

1)  The line in @Tom's code:

    

infile "...\combined*.csv" dsd truncover eov=eov firstobs=2 ;

   means - all csv files starting with COMBINED and any other characters moe, preceeeding the .CSV

 

2) EOV means End Of Volume. That is a flag you get from sas when last record read from all input files.

     You use EOF= to assign a flag for last observation read from sas dataset

     You use EOV= to assign a flag for last record read from external file(s);

 

3) I leave to @Tom to explain how next code works:

    

input @;
  if eov then input ;
  input DMUj0 -- industry ;
  eov=0;
Yegen
Pyrite | Level 9

Thanks for this helpful explanation, @Shmuel. I see, so EOV notifies SAS that the last observation has been reached and that way, SAS moves to the next dataset, right? 

Thanks for all of your excellent explanations. 

Tom
Super User Tom
Super User

@Yegen wrote:

Thanks for this helpful explanation, @Shmuel. I see, so EOV notifies SAS that the last observation has been reached and that way, SAS moves to the next dataset, right? 


That is the wrong way to think of it. SAS already knows how to read multiple files. The EOV= option is how you can tell SAS what variable to use so that it can tell YOUR program when it has started reading from a different physical file.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 13 replies
  • 2614 views
  • 5 likes
  • 4 in conversation