07-08-2015 03:35 PM
(Hopefully I am asking this question in the right place)
Within SAS EG, I am importing system performance metrics via PROC IMPORT from CSV files. This works great. For instance, here is a CSV file. Note that line 1 is set up as the field names.
date_date, date_time, system, disk-name, pctbusy, avserv, avwait
06/29/12, 21:50:00, myserver, sda1, 12.000, 1.000, 1.000
Here is my issue - my new exports are coming in with additional NEW fields in the CSV file:
date_date, date_time, NEW-FIELD, system, disk-name, pctbusy, avserv, avwait
06/29/12, 21:50:00, NEW-VALUE, myserver, sda1, 12.000, 1.000, 1.000
Here is my question - how do I get my PROC IMPORT to read both old and new format CSV files? Is there a way for it to care only about the field name, regardless of its placement?
07-08-2015 03:45 PM
A preferred method would be to talk to the source of the file and do one of two things:
Agree on a constant format in which the file will be supplied (best practice, as everyone involved then knows what is going on, and it will save much heartache over time)
Or have all "new fields" appear to the right of the existing data.
In either case you would be better served by reading the data with a data step. You have better control over the results, and you aren't likely to have issues with text variables whose lengths change from file to file, or with missing values that result from reading what should be character variables as numeric (the default when the type is guessed from the first rows).
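As a rough sketch of what such a data step could look like for the old-format file shown above: the file path, the output data set name, the character lengths, and the display formats are all assumptions, and disk-name is written as disk_name because a hyphen is not valid in a standard SAS variable name.

```sas
/* Sketch only: path, data set name, lengths and formats are assumptions. */
data work.disk_metrics;
    /* DSD treats commas as delimiters; FIRSTOBS=2 skips the header line */
    infile '/path/to/old_format.csv' dsd dlm=',' firstobs=2 truncover;

    /* Fix character lengths up front so they never vary between files */
    length system $32 disk_name $32;

    /* Tell SAS how to read the date and time fields */
    informat date_date mmddyy8. date_time time8.;

    /* Display formats for the stored values */
    format date_date mmddyy10. date_time time8. pctbusy avserv avwait 8.3;

    /* Variables are read in the order the columns appear in the file */
    input date_date date_time system disk_name pctbusy avserv avwait;
run;
```

Because the INPUT statement reads by position, this approach only works when the column order is fixed, which is why it goes together with agreeing on a constant layout.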
I would pose a question though: what process do you have that actually cares about the order of the variables in the data set? If the answer is nothing, then don't sweat it.
07-08-2015 04:00 PM
PROC IMPORT doesn't care as it just adapts to what it sees.
SAS is actually much better at handling this type of sloppy data than many database systems.
So for example if you want to append the new records to an existing table you could use the FORCE option on PROC APPEND and it will ignore the new fields.
Also PROC APPEND and data step merges do not care whether the fields are in the same order, since they match on name and not on position.
Now if they change the NAME of a field then you might need to program for that.
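A minimal sketch of that approach, assuming a hypothetical file path and hypothetical data set names (work.new_metrics for the freshly imported file, work.all_metrics for the existing table):

```sas
/* Sketch only: the path and data set names are assumptions. */

/* Import the new-format file; REPLACE overwrites any earlier import */
proc import datafile='/path/to/new_format.csv'
            out=work.new_metrics
            dbms=csv
            replace;
    getnames=yes;        /* take variable names from line 1 */
    guessingrows=1000;   /* scan more rows before guessing types */
run;

/* Append to the existing table. Variables are matched by name, not    */
/* position; FORCE lets the append run even though the new file has    */
/* variables that are not in the base table (they are dropped).        */
proc append base=work.all_metrics
            data=work.new_metrics
            force;
run;
```

Note that FORCE also lets the append proceed when a variable is longer in the new data than in the base table, in which case values are truncated, so it is worth checking the log afterwards.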
07-09-2015 11:02 AM
Thanks for the input Tom and Ballardw!
I really do not have any control over the way the data comes in because it comes from a popular system monitoring tool. The CSV files will either come in the old format CSV, or the new format. The field names do not change. There may just be new fields.
I tried PROC IMPORT (for the first time) rather than doing a file import in SAS EG. I was able to "flip" fields around successfully.
My next question though is - Can PROC IMPORT do field formatting? (i.e. dates, numbers), or if not, how would you do that?
Sorry to ask so many questions. Any advice is appreciated.
07-09-2015 11:11 AM
No. Proc Import makes a best guess based on the content. For some items it may give you what you want, but there are no guarantees.
If you want to set variable properties such as format or labels after the data is read you can do that with Proc Datasets.
Or you can set the properties on a data set that the incoming data is appended to.
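For example, something along these lines sets formats and a label on an already imported data set. The library, data set name, format widths, and label text are assumptions for illustration:

```sas
/* Sketch only: data set name, formats and label text are assumptions. */
proc datasets library=work nolist;
    modify disk_metrics;
        /* Change how the numeric values are displayed */
        format pctbusy avserv avwait 8.3;
        /* Attach a descriptive label */
        label pctbusy = 'Percent busy';
run;
quit;   /* PROC DATASETS supports run groups, so QUIT ends the procedure */
```

This only changes how stored values are displayed. If PROC IMPORT read a date column as character, a format will not convert it; the value would have to be converted in a data step instead.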
07-09-2015 04:45 PM
Okay, I'm getting there slowly.
My new issue, though, is that PROC IMPORT is holding onto old CSV definitions by inserting an EFI into my code. It is ignoring my PROC DATASETS completely.
How do I tell my PROC IMPORT to really use the field names and stop using the old field names?
07-09-2015 06:48 PM
How do I tell my PROC IMPORT to really use the field names and stop using the old field names
Show the code.
Proc import should not have any memory of any sort.
If you are referring to the EFI sections in the data step generated to read csv, they are only reacting to error codes.
It sounds like you may have an error and are not replacing an existing data set.
Do you have a QUIT at the end of the Proc datasets? It is one of the procedures that supports run groups and uses QUIT to end the procedure.