AndersS
Pyrite | Level 9

Hi! As I understand it, the problem is to read the contents of a .csv file into a SAS data set with several columns. This solution produces only one (1) column. In my opinion, this is NOT a correct solution!
/ Br Anders Sköllermo

Anders Sköllermo (Skollermo in English)
gergely_batho
SAS Employee

You are right. The correct solution combines all the answers here. Let's summarize:

- If the file is really large and you don't want to wait for PROC IMPORT to read all of it, take only the first N rows from the file. You can do this with UNIX head -n N, or by writing a data step with infile + file + input + put statements that includes obs=N on the infile statement.

- Run PROC IMPORT once, possibly with a low guessingrows= parameter.

- Copy the data step code generated by PROC IMPORT.

- Modify the data step by including firstobs=2 and perhaps obs= (if you want to limit the number of observations to read, e.g. for testing purposes).

- Add truncover or missover if needed.

- You might need to modify the generated input statement if PROC IMPORT guessed wrong.

- Consider compress=yes to reduce the size of the output data set.

Now your data step is ready to run with obs=max (to import all the observations).
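
The first two steps above could be sketched roughly as follows. This is only an illustration under assumptions: 'myfile.csv' and 'myfile_sample.csv' are placeholder paths, 1000 is an arbitrary sample size, and lrecl=32767 assumes no line is longer than that.

```sas
/* Copy the header line plus the first 999 data lines to a small file. */
/* Both file names are placeholders.                                   */
data _null_;
  infile 'myfile.csv' lrecl=32767 obs=1000;
  file 'myfile_sample.csv' lrecl=32767;
  input;            /* load the next line into the input buffer */
  put _infile_;     /* write the line out unchanged             */
run;

/* Let PROC IMPORT guess the column types from the sample only. */
proc import datafile='myfile_sample.csv' out=work.sample_guess
            dbms=csv replace;
  getnames=yes;
  guessingrows=1000;
run;
```

After this runs, the generated data step should appear in the SAS log, ready to be copied and pointed at the full file.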

Message was edited by: Gergely Bathó

Tom
Super User

You can read it into variables using dummy variable names.

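For example, to read the first 20 columns of the first 20 data rows.

Use FIRSTOBS=2 to skip the header line. OBS= gives the number of the last record to read, so OBS=21 stops after 20 data rows.

data sample;
  /* dsd: comma-delimited, quoted values allowed; truncover: never flow to the next line */
  infile 'myfile.csv' dsd truncover lrecl=300000 firstobs=2 obs=21 ;
  /* dummy names x1-x20, all read as character */
  length x1-x20 $200 ;
  input x1-x20 ;
run;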

data_null__
Jade | Level 19

Isn't TRUNCOVER meant for FORMATTED and/or COLUMN input?

Since delimited input is a form of LIST input, would there ever be a situation where TRUNCOVER is needed? MISSOVER seems more appropriate.

I know it serves the same role here, but it seems wrong.

Tom
Super User

In my opinion MISSOVER is deprecated and should never be used unless you really do want to throw away short values.

data_null__
Jade | Level 19

The point is there is no such thing as "short values" when reading with LIST input.

Tom
Super User

I agree that the effect is the same for this program. However, since it is an INFILE option and not an INPUT option, there can be a disconnect between the option and the style of input actually used.

Since the MISSOVER option is so wrongly named and potentially harmful for other types of INPUT statements, I think it is important that its use be discouraged.
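
To make the "short values" point concrete, here is a small sketch with formatted input. It writes its own test file, so the fileref DEMO and the data set names are just illustrative, and it assumes the temporary file is written with variable-length records (no trailing padding).

```sas
/* Write three records of different lengths to a temporary file. */
filename demo temp;
data _null_;
  file demo;
  put '1';
  put '22';
  put '333';
run;

/* Formatted input asks for 3 bytes; MISSOVER gives up on shorter records. */
data with_missover;
  infile demo missover;
  input x 3.;
run;

/* Same INPUT statement; TRUNCOVER uses whatever the short record holds. */
data with_truncover;
  infile demo truncover;
  input x 3.;
run;
```

Under those assumptions I would expect WITH_MISSOVER to hold missing values for the first two records and 333 for the third, while WITH_TRUNCOVER holds 1, 22 and 333. With list input (input x;) both options should give the same result, which is the case discussed for the CSV program above.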

SASKiwi
PROC Star

In my experience, using guessingrows with a large value can drastically slow down the import of CSV files.

Behind the scenes, PROC IMPORT builds DATA step code when reading CSVs. You can see this code in the SAS log. Just copy it into your SAS editor and run it. You will be amazed how much quicker it is.
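
For orientation, a data step adapted from the log might look something like the sketch below. The column names, informats and lengths are hypothetical, and the code that PROC IMPORT actually writes to your log will differ in its details (for instance, it may use MISSOVER rather than TRUNCOVER).

```sas
/* Hypothetical columns: replace them with the names and informats */
/* that PROC IMPORT writes to your own log.                        */
data work.mydata (compress=yes);
  infile 'myfile.csv' dlm=',' dsd truncover lrecl=32767
         firstobs=2 obs=max;
  informat customer_id   best32.
           customer_name $200.
           numeric_value best32.;
  format   customer_id   best12.
           customer_name $200.
           numeric_value best12.;
  input customer_id customer_name $ numeric_value;
run;
```

Because the types are fixed in the code, nothing has to be guessed on each run, and a wrong guess can be corrected by editing the informat statement.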

AndersS
Pyrite | Level 9

Hi! I used to import data from Excel sheets some years ago.
The data were a mix of characters (text), characters that had to be interpreted as numbers in SAS, and numbers (with a period or a comma as the decimal sign).
My solution: read each data line into a long character variable CV (32000 bytes).
Then read CV from the left and decide what is alphabetic text, what is a number, and so on, handling all the details.
In my problems I had a column structure from Excel like: CustomerId (number), CustomerName (text), NumericValue (period or comma, perhaps E-format), next NumericValue, etc.

Basically, you have to use this structure to program the reading of the parts of the string.
We used this in production: 5000-10000 lines in Excel sheets, in a very mixed style.
Send me an email at anders.skollermo@one.se with more info. I will solve it. / Br Anders Sköllermo
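
The line-by-line approach described above could be sketched roughly like this. It is not the production code mentioned in the post. It assumes semicolon-separated fields (so that a comma can serve as the decimal sign), a header line, and the three-column layout from the example; the file name and variable names are placeholders.

```sas
data parsed;
  infile 'myfile.csv' lrecl=32767 truncover firstobs=2;
  length cv $32000 customer_name $200 tok $64;

  /* Read the whole line into one long character variable. */
  input cv $char32000.;

  /* Field 1: a number. ?? suppresses error notes for bad values. */
  customer_id = input(scan(cv, 1, ';', 'm'), ?? best32.);

  /* Field 2: text, kept as is. */
  customer_name = scan(cv, 2, ';', 'm');

  /* Field 3: a number with period or comma as the decimal sign. */
  /* TRANSLATE turns a decimal comma into a period first.        */
  tok = translate(scan(cv, 3, ';', 'm'), '.', ',');
  numeric_value = input(tok, ?? best32.);

  drop cv tok;
run;
```

The 'm' modifier on SCAN keeps empty fields in position, so two adjacent delimiters yield a blank value instead of shifting the later columns to the left.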

Anders Sköllermo (Skollermo in English)
