DATA Step, Macro, Functions and more

IMPORT A CSV FILE into SAS

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 9
Accepted Solution

IMPORT A CSV FILE into SAS

I know this questions has been discussed before but I am not sure what is the difference between these 2 approaches ...  1st approach is correct working as desired and 2nd approach is putting everything in 1st variable.

 

Approach 1:

data want;
INFILE 'filepath/filename.csv'
LRECL=256 ENCODING="LATIN1" DLM='2c'x MISSOVER DSD firstobs=2;
INPUT
var1 : $CHAR100.
var2 : $CHAR50.
var3 : ??BEST8.
var4 : $CHAR100.
var5 : $CHAR10.
var6 : $CHAR150.;
run;

 

Approach 2:

data want;
infile "filepath/filename.csv/"
lrecl=256  dlm="2c"x  missover  dsd  firstobs=2  encoding="LATIN1";
input var1 $ 1-100 var2 $ 101-175 var3 176-190 var4 $ 191-215 var5 $ 216-225 var6 $ 226-325;
run;

 

Also, I want to start reading the file from row2, can i still use PROC IMPORT to import this CSV file and get the desired dataset?

 

Regards,

Vinni


Accepted Solutions
Solution
‎02-20-2018 12:41 PM
Super User
Super User
Posts: 8,075

Re: IMPORT A CSV FILE into SAS

The right answer depends on how the file you are reading is structured. If it really is a CSV file then using the list mode style INPUT statement like in the first example is correct.  The INPUT statement in the second example if what you would do if the input file was using fixed column locations to place the data for each field instead of using delimiters on the line to marge where the fields start and stop.  But if your fields are fixed length then you would not want to include the DSD option on the INFILE statement. And you would definitely want to set the LRECL on the INFILE statement longer than the number of columns you were trying to read in the INPUT statement.

 

Personally I would use a data step like this.  I find it helps a lot if you explicitly define the variables instead of forcing SAS to guess how you wanted them defined based on how you first use the variable. And SAS does not need to be told what INFORMAT to use for most variables.

 

data want;
  INFILE 'filepath/filename.csv' dsd firstobs=2 truncover ;
  length var1 $100 var2 $50 var3 8 var4 $100 var5 $10 var6 $150;
  input var1-var6 ;
run;

If you really wanted to suppress notes in the log about invalid numbers for `VAR3` then you could use this for the INPUT statement instead.

  input var1 var2 var3 ?? var4-var6 ;

Note there is no informat named BEST.  If you ask for it SAS will just use the normal numeric informat. In some ways the COMMA informat is a "best" informat for numbers as it will ignore commas and dollar signs that might appear.

Note that CSV files normally have leading spaces stripped already.  Do you really want to preserve any leading spaces that might appear in your input file by using the $CHAR informat instead of just reading the character fields normally?

 

View solution in original post


All Replies
PROC Star
Posts: 1,276

Re: IMPORT A CSV FILE into SAS

You can use the GETNAMES=NO statement in proc import to start at the second row.

Occasional Contributor
Posts: 9

Re: IMPORT A CSV FILE into SAS

Hi Draycut, thanks for your reply.

 

How would you solve the issue for multiple lengths in the variable as in say first observation is 50 characters and second observation is 60 characters.Is there any way to change the length of the variable in PROC Import code?

Solution
‎02-20-2018 12:41 PM
Super User
Super User
Posts: 8,075

Re: IMPORT A CSV FILE into SAS

The right answer depends on how the file you are reading is structured. If it really is a CSV file then using the list mode style INPUT statement like in the first example is correct.  The INPUT statement in the second example if what you would do if the input file was using fixed column locations to place the data for each field instead of using delimiters on the line to marge where the fields start and stop.  But if your fields are fixed length then you would not want to include the DSD option on the INFILE statement. And you would definitely want to set the LRECL on the INFILE statement longer than the number of columns you were trying to read in the INPUT statement.

 

Personally I would use a data step like this.  I find it helps a lot if you explicitly define the variables instead of forcing SAS to guess how you wanted them defined based on how you first use the variable. And SAS does not need to be told what INFORMAT to use for most variables.

 

data want;
  INFILE 'filepath/filename.csv' dsd firstobs=2 truncover ;
  length var1 $100 var2 $50 var3 8 var4 $100 var5 $10 var6 $150;
  input var1-var6 ;
run;

If you really wanted to suppress notes in the log about invalid numbers for `VAR3` then you could use this for the INPUT statement instead.

  input var1 var2 var3 ?? var4-var6 ;

Note there is no informat named BEST.  If you ask for it SAS will just use the normal numeric informat. In some ways the COMMA informat is a "best" informat for numbers as it will ignore commas and dollar signs that might appear.

Note that CSV files normally have leading spaces stripped already.  Do you really want to preserve any leading spaces that might appear in your input file by using the $CHAR informat instead of just reading the character fields normally?

 

Occasional Contributor
Posts: 9

Re: IMPORT A CSV FILE into SAS

Thanks a ton Tom, 

Your contribution is really appreciated!

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 232 views
  • 0 likes
  • 3 in conversation