An Idea Exchange for SAS software and services

Comments
by Super User
on ‎09-26-2014 05:26 AM

Not sure what others' opinions are, but to my understanding delimited files (CSV etc.) are rows of data separated by a delimiter, one record per row, with an *optional* header row.  OK, delimited is not as structured as XML, for instance, but in most cases you should see only data, or a header in the first row and then data.  So I don't see the value of adding an option to read headers from a row other than the first: such files are then not CSV but the private format of whoever wrote the file.  How many cases do we want to cover there?  (Reference: Comma-separated values - Wikipedia, the free encyclopedia; http://tools.ietf.org/html/rfc4180)

As a side note, a data step with the INFILE statement is a far better option for reading delimited files anyway; PROC IMPORT is left in just to give headaches, I am sure.

by Trusted Advisor
on ‎10-13-2014 06:35 PM

You can define named ranges in Excel.

SAS uses a named range to read that area.

This option is already available, just not well documented.

PROC IMPORT for xlsx (as opposed to the old-style xls of Excel 2003) does not have that. For xlsx files it would make more sense to follow the Open Office type, as MS Excel is based on that.

by Contributor LearnByMistk
on ‎11-03-2014 02:59 PM

The GETNAMES option is valid for that.

by Occasional Contributor Lin_Clare
on ‎02-08-2015 07:51 PM

That seems valid only if the variable names are in the first row.

by Trusted Advisor
on ‎02-09-2015 01:31 AM
by Occasional Contributor Lin_Clare
on ‎02-10-2015 12:19 AM

Then you will get the default variable names Var1, Var2, ... while in fact the raw file does have variable names, just not in the first row. When a file has many variables, say hundreds, it is quite useful to be able to get all the variable names automatically.

by Trusted Advisor
on ‎02-10-2015 01:03 AM

There is no way to predict every possible approach.
For the common conventions, SAS already has much trouble keeping the interfaces up to date (e.g. Open Office, JSON).
With the data step you can program whatever conventions you need. That is why the need for programming exists. (no creationists)

by Occasional Contributor RobP
on ‎04-14-2015 03:39 PM

Seems unnecessary to me.  For unusually formatted files, use a data step and INFILE.  I'd rather see the dev team work on other problems that aren't easily solved.
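
As a rough sketch of the data step and INFILE approach for a file whose names are not in the first row: the FIRSTOBS= option starts reading at a chosen record. The file name, the layout (title on record 1, names on record 2, data from record 3), and the two variables below are assumptions made up for illustration, not taken from any real file.

```sas
/* Hypothetical layout: record 1 = report title, record 2 = column names, */
/* record 3 onward = data. File name and variables are invented.          */
data work.report;
    /* DSD handles quoted fields and consecutive delimiters; */
    /* FIRSTOBS=3 skips the title and the names row.         */
    infile 'vendor_report.csv' dsd dlm=',' firstobs=3 truncover lrecl=32767;
    length device $32;
    input device $ reading;
run;
```

The variable names and types are written by hand here, so nothing is guessed, but for a file with hundreds of columns that is exactly the typing the requested option would save.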

by Contributor lloydc
on ‎04-15-2015 03:18 PM

I'd vote yes. Last year I had a problem reading in CSV files created by a hardware vendor's report. They put their own report name into the CSV file as the first data record. The field names were on the second data record, but GETNAMES doesn't allow specifying which row. I wound up having to use a data step to strip out the first record, then pass the output file to PROC IMPORT. Messy and aggravating.
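
The workaround described above might look roughly like this. It is only a sketch: the file name is hypothetical, and the temporary fileref is one possible choice. The first step copies every record except the first to a temporary file, and PROC IMPORT then takes the names from what has become the first row.

```sas
/* Hypothetical input file; the copy goes to a temporary file. */
filename raw 'vendor_report.csv';
filename clean temp;

/* Step 1: copy the file without its first record (the report name). */
data _null_;
    infile raw firstobs=2 lrecl=32767;
    file clean lrecl=32767;
    input;            /* load the record into the input buffer */
    put _infile_;     /* write it out unchanged                */
run;

/* Step 2: the names are now in the first row, so GETNAMES works. */
proc import datafile=clean out=work.report dbms=csv replace;
    getnames=yes;
run;
```

DBMS=CSV is given explicitly because the temporary file has no .csv extension for PROC IMPORT to go by.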

by Trusted Advisor
on ‎04-15-2015 04:29 PM

Lloydc, that was frustrating for you at the time, but not messy and aggravating.

The real mess comes from the surprises of PROC IMPORT's guessing approach: you are never sure what it will decide.

You could have got a data checksum or more as the first record, or the data could have been delivered on not one but multiple lines per record (many variations are possible).
Any construction that is well documented, as it should be, can be handled reliably by a data step.
Guessing with surprises is something done in a casino (Monte Carlo, another area of statistical approaches).
