BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
wil1212
Calcite | Level 5

Hello,

 

I am trying to import a large CSV file into SAS. The variable names start on row 2 and data starts on row 3. How can I get SAS to start reading at row 2? I cannot open the file and delete the first row, because the file is to large for excel. I have also tried namerow=, but it doesn't work. Any help would be appreciated.

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

My preferred method:

Never let SAS dynamically guess variable names and parameters from the input file; instead rely on documentation of the file structure that MUST be delivered with the data.

In your case, I would

- write a data step that corresponds to the documented file structure

- use firstobs=3 in the infile statement to discard the first two lines completely

- make sure that the code exits with an error condition if the data does not conform to the expected structure

View solution in original post

7 REPLIES 7
ballardw
Super User

If you must use proc import, try opening in another application such as WORDPAD to delete the first row if Excel is having problems.

 

In fact, if the data did not originate in EXCEL and is large enough, opening in Excel and deleting a row open you up to having Excel lose rows and change the values of some of your variables: leading zeroes disappear, some values like "1-5" might get turned into dates and other undesired side effects.

 

I recommend looking closely at the log after using Proc import to see the code generated an ensure that variables have expected data types and formats. If you need to change something, not uncommon as Proc Import is guessing, recall the code from the log and edit and use that program to read the data.

 

And a kludge: read the csv file and write all of the file except the first file out to a different file. make sure the the LRECL parameter is long enough to handle your longest line. Import the newly created file.

 

data _null_;
   infile "input filename" lrecl=4000;
   file "modified filename" lrecl=4000;
   input ;
   if _n_>1 then put _infile_;
run;

 

 

Reeza
Super User

Can you post a sample of the file and your current code?

What OS are you on?

 

Assuming Windows, you can extract a sample of your file to work with using PowerShell 

https://jamesmccaffrey.wordpress.com/2011/08/25/copying-the-first-n-lines-of-a-text-file-using-power...

 

 

wil1212
Calcite | Level 5

I am using Windows 12. The file has 34 variables, and I am unsure how many rows.

 

proc import datafile="D:\projects\Billing History\ResBillHis2015.csv"

out=ResBillHis2015

dbms=csv

replace;

namerow=2;

datarow=3;

 

run;

Ksharp
Super User

These code could delete the blank row , but not efficient .

 


data _null_;
file "/folders/myfolders/new_edu.csv";
infile "/folders/myfolders/edu.csv" lrecl=32767;
input;
if missing(_infile_) then delete;
 else put _infile_;
run;

proc import datafile="/folders/myfolders/new_edu.csv"
out=x
dbms=csv
replace;
run;
SASKiwi
PROC Star

I don't think NAMEROW =  is a valid statement.

Try adding a DATAROW statement: datarow=3;

 

Alternatively use a text editor that can cope with the size of the file to delete the first line, possibly Wordpad (comes with Windows). There are also a lot of free text editors available on the internet.

Ksharp
Super User

Very Interesting question. Two thing I can thing.

1) use

proc import datafile="/folders/myfolders/edu.csv"
out=x
dbms=csv
replace;
datarow=3;
run;

 

import firstly then after that rename it .

 

2) copy the code genereated by proc import  from LOG and modify it by hand .

use 

options obs=10;

when you run proc import to make it faster .

Kurt_Bremser
Super User

My preferred method:

Never let SAS dynamically guess variable names and parameters from the input file; instead rely on documentation of the file structure that MUST be delivered with the data.

In your case, I would

- write a data step that corresponds to the documented file structure

- use firstobs=3 in the infile statement to discard the first two lines completely

- make sure that the code exits with an error condition if the data does not conform to the expected structure

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 12919 views
  • 0 likes
  • 6 in conversation