08-17-2011 11:24 AM
I have a very large file showing subhourly generation by various customers. This file is stored as a csv file. The fields are: ID, dtid, engo, engo_f. ID is the customer id, dtid is the date/time of measurement, engo is the generation amount, and engo_f is a flag variable. This is how the fields are supposed to be ordered per the documentation I was given by our consultant (please note this contract ended some time ago).
This is the program I used to read this file with a data step:
data Engo ;
attrib dtid format = datetime16. ;
infile '--blankedout---.csv' dlm = ',' dsd obs = 1000000
firstobs = 2 missover ;
input Id $ Dtid datetime16. Engo Engof $ ;
After submitting this program, I seem to be getting correct values for ID, Dtid, and engo_f. I keep getting missing values for the field "engo".
When I try to read the same file with proc import, I get much better results.
options obs = 1000000 ;
proc import datafile = '--blankedout---.csv'
out = Engo replace dbms = csv ;
guessingrows = 125 ;
Except for one strange thing: the column order under proc import becomes (at least, this is how the output file looks): ID, Dtid, engo_f, engo. My questions: 1.) Did our consultant make a mistake in their documentation? 2.) If the correct order of the fields is as given in the output of the second program, then why did the field "engo_f" get picked up correctly in the first program? Are there some options that I am missing and that I should use in the first program?
08-17-2011 11:36 AM
I think I'm starting to sound like a broken record, but I think your easiest option for discovering the answer would be to grab the code that proc import creates and simply see how it differs from the code that you submitted.
A nice article on the method can be found at: http://www2.sas.com/proceedings/sugi30/038-30.pdf
It could simply be a difference in the order of the fields, the field types and/or modifiers that proc import used but you didn't.
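For a delimited file, proc import writes the data step it generates to the SAS log, so one way to make that comparison is to run it on a small sample and read the log. A minimal sketch, assuming the same blanked-out path as in the question; the output name Engo_sample and the 1000-row limit are only illustrative choices:

```sas
/* Sketch only: limit the rows so the import runs quickly. */
options obs = 1000 ;

proc import datafile = '--blankedout---.csv'
   out = Engo_sample replace dbms = csv ;   /* Engo_sample is a hypothetical name */
   guessingrows = 125 ;
run ;

/* Restore the default so later steps read every row. */
options obs = max ;

/* The generated data step (infile, informat, format and input statements) */
/* appears in the log; compare its input statement with the hand-written one. */
```

The order of the variables and the informats in the generated input statement are the parts worth comparing against the original program.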
08-17-2011 12:25 PM
Then I'd have to think you were provided with an incorrect layout to begin with as proc import would only be reading the variable names from the first record.
08-18-2011 01:17 AM
I think you need the colon modifier in your input statement:
input Id $ Dtid : datetime16. Engo Engof $ ;
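Your original code uses formatted input (i.e. datetime16. with no modifier), so it always reads exactly 16 columns and will eat everything in them, even the delimiter. That leaves the pointer in the wrong place for the next field, which is why Engo comes back missing.
Adding the : makes SAS stop at the next delimiter and prevents this.
Alternatively, remove datetime16. from the input statement and declare it as an informat in the attrib statement. Note that your current attrib only sets the format, which controls how Dtid is displayed, not how it is read.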
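Put together, the two options might look like the sketch below. It assumes the field order from the documentation (ID, dtid, engo, engo_f) and reuses the blanked-out path from the question; if proc import is right about the column order, the last two variables in each input statement would need to be swapped.

```sas
/* Option 1: keep the informat in the input statement, with the colon modifier. */
data Engo ;
   attrib Dtid format = datetime16. ;
   infile '--blankedout---.csv' dlm = ',' dsd obs = 1000000
          firstobs = 2 missover ;
   /* The colon tells SAS to read up to the next delimiter, */
   /* then apply datetime16. to what it read.               */
   input Id $ Dtid : datetime16. Engo Engof $ ;
run ;

/* Option 2: declare the informat in attrib and use plain list input. */
data Engo ;
   attrib Dtid informat = datetime16. format = datetime16. ;
   infile '--blankedout---.csv' dlm = ',' dsd obs = 1000000
          firstobs = 2 missover ;
   input Id $ Dtid Engo Engof $ ;
run ;
```

Both versions read Dtid as a delimited field, so the pointer lands on the start of Engo afterwards.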