Help using Base SAS procedures

reading a csv file starting from 4th row using proc import

Accepted Solution Solved
Reply
Super Contributor
Super Contributor
Posts: 440
Accepted Solution

reading a csv file starting from 4th row using proc import

i have a  csv file and  data starts  from  4th road (3rd road is variable names).

Is there a way i can  import it  using proc import?


Accepted Solutions
Solution
‎10-16-2015 12:04 PM
Super User
Super User
Posts: 7,392

Re: reading a csv file starting from 4th row using proc import

Its not really a CSV file then is it.  CSV file has row 1 with column names, row 2 data starts and both are delimited by comma's.  You have a CSV "like" file.  Step one, infile the datafile, read whole thing and if _n_ > 2 then output to new file.  Step 2, read in the new file with just the headers and data.

View solution in original post


All Replies
Frequent Contributor
Posts: 144

Re: reading a csv file starting from 4th row using proc import

Use this statement in the proc import
DATAROW=4;
Super User
Posts: 10,458

Re: reading a csv file starting from 4th row using proc import

If you want to get the variable names from the file I believe that you'll need to delete or cut and paste the first two rows to another file. Currently proc import expects the variables to be on the first row.

 

Solution
‎10-16-2015 12:04 PM
Super User
Super User
Posts: 7,392

Re: reading a csv file starting from 4th row using proc import

Its not really a CSV file then is it.  CSV file has row 1 with column names, row 2 data starts and both are delimited by comma's.  You have a CSV "like" file.  Step one, infile the datafile, read whole thing and if _n_ > 2 then output to new file.  Step 2, read in the new file with just the headers and data.

Super Contributor
Super Contributor
Posts: 440

Re: reading a csv file starting from 4th row using proc import

we  are trying to  put one  process  in production so  i cant  really play  with the  file (cutting,copying,pasting...) and  that file will come  in externaly as a  CSV one. So far they been sending a xls file where  data starts from 4th row but it was  easy to import it using the "startrow" and "namerow" within the proc import which does not work when it comes   to CSV.

So:

1.  i should hope the csv file will always  have  the  headers  in the  first row and  data  from the second  onwards?

2. infile the  csv file and   read in  from _n_>2 ?

 

Thank you guys

Frequent Contributor
Posts: 144

Re: reading a csv file starting from 4th row using proc import

proc import datafile="path/Have.csv"
            dbms=csv
            out=Want
            replace;
         datarow=2;
run;

 

As says, you could not get the names with proc import if doesn't start in the first row

Community Manager
Posts: 2,756

Re: reading a csv file starting from 4th row using proc import

PROC IMPORT will help you with a well-formed CSV file -- glean variable names from the first row, and "guess" the variable types based on the remaining data rows.  But as others point out, this depends on having data begin in the second row.

 

If there are blank rows, or rows that contain information that is not data, then PROC IMPORT might not behave as well.  In this case it might be better to use INFILE with a CSV file.  But you'll need to define the variable attributes, informats, and exert more control over how the file is read in.  

 

Behind the scenes, PROC IMPORT generates DATA step code to do its work, so you might consider just short-circuiting that and write your own DATA step for full control.  More work, but brings you more flexibility over reading a file whose format you don't control.

 

Chris

Super User
Posts: 10,458

Re: reading a csv file starting from 4th row using proc import

If you know what the structure of the file will be, as in agreed upon by the providers and users, and the data columns remain in the same order and you know the characteristics such as length, format and data type then you use a data step to read the file. The infile statement would use the firstobs=4 or whatever to point to the data row.

 

You can use Proc import to give you a leg up on writing the code by 1) putting an example file in a nice format such as row 1 = headers, 2) run proc import. The log will have the data step code generated to read the file. Copy that out and save as a separate program file. 3) modify the informats, formats and add label code and 4) change the Firstobs to match the actual data. I would recommend modifying the code to use a filename statement and fileref in the data step for running against later data sets.

 

With the data step code the only concerns become undocumented changes such as change of order of variables, changed number of header rows and changed lengths of variable contents.

Super User
Super User
Posts: 7,392

Re: reading a csv file starting from 4th row using proc import

Well, you have some technical advice above.  I would just like to add that it is all in the documentation.  Ensure you have a document which describes the CSV data.  This should be signed by both parties and ahered to.  The programming is a very simple matter then, merely coding the information in the agreement, but without an agreement you could end up changing things each time.  Simple programming advice: 

Document

Agree

then

Code.

Super Contributor
Super Contributor
Posts: 440

Re: reading a csv file starting from 4th row using proc import

yea CSV files indeed come as you  said RW9 so i panicked  for no reasons  but  thank you guys for the  hints

Valued Guide
Posts: 765

Re: reading a csv file starting from 4th row using proc import

Hi.  If you have an XLS file with the first two rows blank, column names in row 3, data starting in row 4 ... try this with the attached XLS file ...


libname x 'z:\test.xls';

 

data new;
set x.'sheet1$'n ;
run;

 

libname x clear;

 

data set NEW ...

 

Obs     x     y     z         date    name

 1     10    20    30    15OCT2015    HEY
 2     99    99     .    01JAN1960    IT
 3      1     2     3    11NOV2011    WORKS

 

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 10 replies
  • 2867 views
  • 7 likes
  • 6 in conversation