BookmarkSubscribeRSS Feed
DivyaGadde
Fluorite | Level 6

Hi!

 

I am a beginner and started using SAS for a few months. I've got a problem and would like to use the help and advice of the SAS experts here.

I have multiple CSV data files with the same set of variables in every raw data file (also all the files are present in the same directory). I would like to import all of them once and create a new SAS.

 

Thanks for all your help in Advance!

5 REPLIES 5
Tom
Super User Tom
Super User

Since CSV files are just text there is no need to "import" them.  Just read them.

If you really don't know what the structure is you could use PROC IMPORT to read one and then use the code it generates to help you writing code to read the file.  But it writes really ugly code over complicated code.

 

As to the multiple file issue having all of them in one directory helps. Especially if you want to read ALL of the CSV files in that directory. In that case you can use a * wildcard in the file name of the INFILE statement.  The only tricky part will be if the files have header lines.  For that you can use the FILENAME= option of the INFILE statement.  That is also useful if there is information in the name of the file that is not already in the file.  Like a date or company name.

 

data want;
   length fname $200 ;
   infile 'c:\mydir\*.csv' dsd dlm=',' truncover filename=fname;
   input @;
   if fname ne lag(fname) then input;
   input var1 var2 .... ;
run;
DivyaGadde
Fluorite | Level 6

Thank you for your quick response.

 

Do I have to list out all the variables in the input statement? 

Tom
Super User Tom
Super User

@DivyaGadde wrote:

Thank you for your quick response.

 

Do I have to list out all the variables in the input statement? 


Yes and no.  You need to write a data step that reads all of the variables.  But you can use short cuts so that the actual INPUT statement does not need to list each variable name.  One way is if you have a series of variables that are repeated measures of a similar value that you want to name consistently.  Say you had 6 date variables in a row.  You could name them date1,date2, ... date6 and then in code use DATE1-DATE6 to refer to all 6 at once.

In particular for reading from delimited files I prefer to define the variables first.  For example by using the LENGTH statement to tell SAS whether the variable is character or numeric and how many bytes to reserve to store the values.  $ indicates a character variable.  SAS stores all numbers as 8 byte floating point numbers, so length should be 8.  (You can store only part of the 8 byte floating point number, but then you lose precision).

data want;
   infile 'myfile' dsd truncover ;
   length 
     Firstvar $20
     nextvar 8
...
     next2lastvar $10
     lastvar $500 
   ;

One way to help with that is to just copy the first line from the CSV file and turn it into the LENGTH statement.

 

Then the INPUT statement can use a positional variable list.

  input firstvar -- lastvar;

 

If any of your variable require special informats to be used when reading from the text file or special formats so that their values are clear to humans (this mainly means date, time and datetime values) then add an INFORMAT and/or FORMAT statement to attach the appropriate informat or format to those variables. Most variables do not need either an informat nor a format attached.  For example you might have a date string in you file in MDY order, in which case you would use this informat.

  informat mydate mmddyy.;

And now that you have a date value you need to use one of the many date type formats so it will display in a human understandable way.  I prefer to avoid MDY or DMY order for dates as they are potentially confusing.  Is 10/12 the tenth of December or November twelfth?

  format mydate yymmdd10.;

You could include informat specifications in the INPUT statement.  But when reading a delimited file, like a CSV file, make sure to prefix the informat with : so that SAS knows to still read the field in list mode so that it honors the delimiters in the line.

  input ... .mydate :mmddyy. ... ;

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 5965 views
  • 2 likes
  • 4 in conversation