BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
A_SAS_Man
Pyrite | Level 9

I am importing a csv file into SAS as a part of a larger process. I am using the code below.

 

proc import datafile="/data/data.csv" 
out=test
dbms=csv replace;
GUESSINGROWS=MAX;
 run;

When I run this, it generates the following Note, but it seems to act as an error would because it makes my process stop and it is underlined in red. But again, I have no "Errors" when I look at my log summary just a few warnings and notes.

 

SAS Error.PNG

 

In addition, even though it stops my process it actually successfully generates the output data set. Can anyone help me understand what is going on with this and how I could potentially fix it?

 

Edit: The entire log is very long but I believe based on @ballardw's comment I have found the more relevant part. It seems as though multiple column names are being truncated and coming in with the same name, so after the first instance of that column it generates an error. See below extended log information.

 

172               informat PresenceOfDiabetesType2InsulinDe best32. ;
173               informat "CurrentlyBeingTreated-DiabetesTy"N $1. ;
174               informat PresenceOfDiabetesType2NonInsuli best32. ;
NOTE: SCL source line.
175               informat "CurrentlyBeingTreated-DiabetesTy"N best32. ;
                                                               _______
                                                               485
NOTE 485-185: Informat $BEST was not found or could not be loaded.

176               informat PresenceOfDiabetesType1 best32. ;
NOTE: SCL source line.
177               informat "CurrentlyBeingTreated-DiabetesTy"N best32. ;
                                                               _______
                                                               485
NOTE 485-185: Informat $BEST was not found or could not be loaded.

Farther up in the log you can see this as well:

 

Name CurrentlyBeingTreated-DiabetesType2Insulin truncated to CurrentlyBeingTreated-DiabetesTy.
Name PresenceOfDiabetesType2NonInsulinDependent truncated to PresenceOfDiabetesType2NonInsuli.
Name CurrentlyBeingTreated-DiabetesType2NonInsulin truncated to CurrentlyBeingTreated-DiabetesTy.
Name CurrentlyBeingTreated-DiabetesType1 truncated to CurrentlyBeingTreated-DiabetesTy.

So thank you for potentially finding that issue, is there a way to modify proc import to allow longer variable names or something along those lines? These files are provided from a vendor so I have no control over the naming conventions and am trying to keep this as automated as possible.

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

The problem is stemming from extremely long column headings. Note that line 173 and 175 are attempting to assign different informats and formats to multiple columns with the same name after truncation to the allowed 32 characters. I have had similar data with long column headings but never used Proc Import with Validvarname=any, which allows use of the name literals. The Import generator often fixes some of this with similar names suffixed with 1,2,3 but very long names seem to be the exceptions  You have the data data step; copy to the editor and change the names of one of the variables (in each pair, I think you have a fair number). You can drop most of the FORMAT statements and likely the Informats using BEST32. Just make sure that the Input statement has different variable names. If you remove informat statements the order of columns in the data set will change. If you are extremely careful with editor search and replace you can find the second of each of the Informat and Input occurrences of the name and change just that one. Save the code and rerun.

 

173               informat "CurrentlyBeingTreated-DiabetesTy"N $1. ;
174               informat PresenceOfDiabetesType2NonInsuli best32. ;
NOTE: SCL source line.
175               informat "CurrentlyBeingTreated-DiabetesTy"N best32. ;
                                                               _______
                                                               485
NOTE 485-185: Informat $BEST was not found or could not be loaded.

 

View solution in original post

15 REPLIES 15
ballardw
Super User

How about showing the entire log of the data step generated and executed by Proc Import.

 

That specific message means that the variable has already been declared as character and as such  a numeric informat, or format, cannot be used for that variable. That is what the $best indicates, that a character variable is used.

 

 

Reeza
Super User
That would be a really weird error for a standard proc import code to generate. Did you modify the code generated and then get those errors?
A_SAS_Man
Pyrite | Level 9
I'm not really sure what you mean, I ran the code above in my original post and it generated that output.
Reeza
Super User

Definitely need to show the full log as @ballardw requested.

A_SAS_Man
Pyrite | Level 9
Thank you, I posted more information from the log which I think gets at what you are saying.
ballardw
Super User

Start in the log at the Proc Import line. Copy everything through the the notes at the end of the run about observations read.

Paste all of that into a text box.

 

The note about SCL code often means that something with a dot is used incorrectly but Proc Import generally doesn't do that.

The addition of truncated variable names is not related.

 

I would also suggest setting: Options validvarname=v7; to stop creating those name literal variables with hyphens, spaces and other non-standard characters in the variable names. It is going to be bad enough dealing with those extremely long variable names without adding the requirements to repeatedly address the "long var name"N literal on top.

A_SAS_Man
Pyrite | Level 9
It should now be attached to my original post.
A_SAS_Man
Pyrite | Level 9
The addition of that option actually solves the issue, repeat variable names are just imported formatted as "VAR131" or similar.
Reeza
Super User

I think for a file with these type of issues I'd copy the code from the log, manually modify it to be correct including renaming variables as needed and use that to import your data instead of PROC IMPORT. This is a better approach for an automated process anyways as it will also ensure that your variables are always read correctly and the same type/format which is not guaranteed with PROC IMPORT. 

A_SAS_Man
Pyrite | Level 9
I didn't realize proc import was just using a data step. That is really helpful, going to do some testing on that now.
ballardw
Super User

The problem is stemming from extremely long column headings. Note that line 173 and 175 are attempting to assign different informats and formats to multiple columns with the same name after truncation to the allowed 32 characters. I have had similar data with long column headings but never used Proc Import with Validvarname=any, which allows use of the name literals. The Import generator often fixes some of this with similar names suffixed with 1,2,3 but very long names seem to be the exceptions  You have the data data step; copy to the editor and change the names of one of the variables (in each pair, I think you have a fair number). You can drop most of the FORMAT statements and likely the Informats using BEST32. Just make sure that the Input statement has different variable names. If you remove informat statements the order of columns in the data set will change. If you are extremely careful with editor search and replace you can find the second of each of the Informat and Input occurrences of the name and change just that one. Save the code and rerun.

 

173               informat "CurrentlyBeingTreated-DiabetesTy"N $1. ;
174               informat PresenceOfDiabetesType2NonInsuli best32. ;
NOTE: SCL source line.
175               informat "CurrentlyBeingTreated-DiabetesTy"N best32. ;
                                                               _______
                                                               485
NOTE 485-185: Informat $BEST was not found or could not be loaded.

 

Reeza
Super User

Yup! It just generates the data step code. I often use it as my starting point to write code for large files (lazy/efficient) method IMO. 

Then modify as necessary 🙂

I would use @ballardw suggestion to re-run with option validvarname=v7 to clean up some of those names as well.

 

FYI - if you're using SAS Foundation (not Studio) you can copy the code without the line numbers by holding down ALT + CNTRL while selecting the code. I use Studio so use NotePad++ to remove the line numbers these days. 

 

A_SAS_Man
Pyrite | Level 9
I am also using studio, can you expand on how you use notepadd++ to remove line numbers?

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 15 replies
  • 1412 views
  • 10 likes
  • 4 in conversation