GeorgeBonanza
Obsidian | Level 7

I am trying to read a delimited text file with the below proc import but it is producing errors.  The first row of the file contains variable names.

proc import out= work.IMPORT
	datafile= "C:\HAVE.txt"
	dbms=dlm replace;
	delimiter="|";
	getnames=yes;
	datarow=2;
run;

In the log I am receiving these notes:

NOTE: Invalid data for 

Errors detected in submitted DATA step. Examine log.

ERROR: Import unsuccessful. See SAS Log for details.
NOTE: The SAS System stopped processing this step because of errors.

 

It appears that there are blank values in character fields that look like this: |""|

I believe the error occurs when the record is read in and that blank value is converted to missing numeric.

 

Any ideas what I might be doing wrong?

 

Thanks in advance.

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

Add "Guessingrows=max;"

 

The "Invalid data" note typically appears because Proc Import guesses each variable's type and informat from relatively few records. If the content changes partway through the file, values further down can break the type that was chosen from the first rows.
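As a minimal sketch, this is the import from the question with only the GUESSINGROWS statement added. The file path and data set name are the ones from the original post; nothing else is changed.

```sas
/* Same import as in the question, with GUESSINGROWS=MAX added so that
   PROC IMPORT scans all rows before deciding each variable's type. */
proc import out=work.IMPORT
    datafile="C:\HAVE.txt"
    dbms=dlm replace;
    delimiter="|";
    getnames=yes;
    datarow=2;
    guessingrows=max;
run;
```

Scanning every row can take noticeably longer on large files, so this trades import time for a more reliable guess.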

 

Things that cause this sort of message are variables that should be dates (or times or similar) but have a small number of values like NA or "not recorded", or numbers that contain stuff like "< 4". Each specific sort of mismatched data can require a different solution, so provide the entire log, which will show many details about the values encountered.

Often the simple fix is to copy the data step code generated by Proc Import to read text files into the editor and change the INFORMAT statements to better match the data.
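A rough sketch of what that edited data step might look like is below. The variable names ID, ACCOUNT and AMOUNT and their lengths are hypothetical, chosen only for illustration; the real names and informats would come from the code that Proc Import writes to your log.

```sas
/* Hypothetical layout: ID, ACCOUNT and AMOUNT are illustrative names only. */
data work.IMPORT;
    /* DSD handles quoted values and consecutive delimiters;
       FIRSTOBS=2 skips the header row. */
    infile "C:\HAVE.txt" delimiter='|' dsd missover firstobs=2 lrecl=32767;
    /* ACCOUNT was guessed as numeric (best32.) in this scenario;
       here it is declared as character instead. */
    informat ID best32. ACCOUNT $20. AMOUNT best32.;
    format   ID best12. ACCOUNT $20. AMOUNT best12.;
    input ID ACCOUNT $ AMOUNT;
run;
```

The only change that matters in this sketch is the informat on the problem variable; the INFILE options mirror what Proc Import typically generates for a delimited file.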


6 REPLIES
Ksharp
Super User
Try
dbms=csv
GeorgeBonanza
Obsidian | Level 7
Thanks for taking the time to respond. Unfortunately, I am getting the same errors with this change.
Kurt_Bremser
Super User

No need to use PROC IMPORT. Text files are read with a DATA step you write according to the documentation you received with the file.

For help here, we need to see the file. Open it with a text editor (Notepad or Notepad++, not something like Excel), and copy/paste the first few lines, including the header, into a code window in the forum editor.

GeorgeBonanza
Obsidian | Level 7

Thank you for taking the time to respond. Using guessingrows=max worked. The variable that was producing errors is a character field, but some values look like a string of integers, such as "1234567890", while others are alphanumeric. Looking at the data step code produced in the log, it was using best32. as the informat.

 

 

ballardw
Super User

Glad that helped.

 

A preventive measure if you are going to read multiple files that should have the same structure: save the code generated by Proc Import. The next time you need to read a file, change the input file name and, if desired, the output data set. That way all of the variables will stay the same type and you don't need to mess with "fixing" data.
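One possible way to reuse the saved code is to wrap it in a small macro so that only the file name and output data set change between runs. This is a sketch under assumptions: the macro name read_have, its parameters, the second file C:\HAVE2.txt, and the variables ID, ACCOUNT and AMOUNT are all hypothetical.

```sas
/* Hypothetical macro wrapping a saved data step; names are illustrative. */
%macro read_have(path=, out=);
    data &out;
        infile "&path" delimiter='|' dsd missover firstobs=2 lrecl=32767;
        /* Types are fixed here, so every file is read the same way. */
        informat ID best32. ACCOUNT $20. AMOUNT best32.;
        input ID ACCOUNT $ AMOUNT;
    run;
%mend read_have;

%read_have(path=C:\HAVE.txt,  out=work.IMPORT);
%read_have(path=C:\HAVE2.txt, out=work.IMPORT2);
```

Simply editing the file name in the saved data step by hand works just as well; the macro only saves retyping when there are many files.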

 

Consider what happens if you get a similar file in which that column contains only integers throughout. Proc Import will then read the column as numeric, which means that when you go to combine two or more data sets you will get a data type mismatch error.
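The mismatch can be reproduced with two tiny data sets. The data set names and the variable ACCOUNT are made up for this illustration.

```sas
/* ACCOUNT is character in one data set and numeric in the other. */
data work.first;
    ACCOUNT = "A123";        /* character */
run;

data work.second;
    ACCOUNT = 1234567890;    /* numeric */
run;

/* This step fails: SAS reports that ACCOUNT has been defined
   as both character and numeric. */
data work.both;
    set work.first work.second;
run;
```

Reading every file with the same saved data step avoids this, because ACCOUNT is declared with the same type each time.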

 

In the saved code I would typically make most of the character variables 10 to 20 percent longer, by changing the INFORMAT, to account for differences of content in the next files.
