02-12-2015 04:43 PM
I'm trying to import a large tab-delimited file that contains a notes field. In some records the notes contain special characters that appear to be tabs. When SAS imports such a record, everything after the special character is assigned to the next field, and the rest of the data is pushed one field further downstream for each special character in the record. Is there a way to ignore these special characters during the import?
I've experimented with importing the file line by line and removing the special characters, but when I export the data it loses the tabs and the first line is removed. Any code examples that will remove the special characters while keeping all of the characteristics of the text file would be helpful.
02-12-2015 04:53 PM
If the value of a field includes the delimiter, then the value should be enclosed in quotes. If it is not, see if you can have the file regenerated with proper formatting. If they cannot generate the file with quotes around values that contain the delimiter, ask them to use a different character as the delimiter.
To see the actual values in the file you can use a DATA _NULL_ step with the LIST statement.
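data _null_; infile 'myfile.txt' firstobs=1 obs=10; /* OBS= limits the step to the first 10 records */
input; list; run; /* LIST writes each record to the log, in hex when it holds unprintable characters */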
02-12-2015 05:00 PM
Thanks for the reply Tom. I don't have any control over this file. There has been a larger discussion about putting the notes last so this isn't an issue but things move slowly here.
02-12-2015 07:15 PM
I would be forceful with them. Without proper formatting the file could be totally unreadable.
If there is just one field that could have multiple tabs then you can fix it by parsing out the fields before and after the field with tabs first.
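A minimal sketch of that approach, under assumptions that are not from the original file: the layout is ID, VISIT_DATE, NOTES, STATUS, AMOUNT (all names and lengths are hypothetical), the file has a header row, and only NOTES can contain stray tabs. The leading fields are taken from the left, the trailing fields from the right, and whatever lies between is treated as the notes, with embedded tabs turned into spaces.

```sas
/* Hypothetical layout: ID, VISIT_DATE, NOTES, STATUS, AMOUNT. */
%let n_before = 2;   /* fields to the left of NOTES  */
%let n_after  = 2;   /* fields to the right of NOTES */

data want;
  /* firstobs=2 assumes the first line is a header row */
  infile 'myfile.txt' firstobs=2 lrecl=32767 truncover;
  length id $20 visit_date $10 notes $2000 status $10 amount 8;
  input;                               /* load the raw record into _INFILE_ */

  /* A usable record needs at least n_before + n_after tabs. */
  if countc(_infile_, '09'x) < &n_before + &n_after then do;
    put 'Skipping short record: ' _infile_;
    delete;
  end;

  /* Walk left to right to the tab that ends the last leading field. */
  start = 0;
  do i = 1 to &n_before;
    start = findc(_infile_, '09'x, start + 1);
  end;

  /* Walk right to left to the tab that precedes the first trailing field. */
  /* A negative start position makes FINDC search toward the left.        */
  stop = length(_infile_) + 1;
  do i = 1 to &n_after;
    stop = findc(_infile_, '09'x, -(stop - 1));
  end;

  /* Leading fields by positive index, trailing fields by negative index. */
  /* The 'm' modifier keeps empty fields instead of skipping them.        */
  id         = scan(_infile_, 1, '09'x, 'm');
  visit_date = scan(_infile_, 2, '09'x, 'm');
  status     = scan(_infile_, -2, '09'x, 'm');
  amount     = input(scan(_infile_, -1, '09'x, 'm'), ?? 32.);

  /* Everything between the two boundary tabs is the notes text. */
  /* SUBSTRN tolerates a zero-length result when NOTES is empty.  */
  notes = translate(substrn(_infile_, start + 1, stop - start - 1), ' ', '09'x);

  drop i start stop;
run;
```

This only works when a single field can contain extra tabs. If two fields can, the boundaries are ambiguous and the file has to be fixed at the source.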
Search the forum as this question has been asked many times already.
02-12-2015 07:24 PM
This is the least of their problems. I think this is number 348 on the list of things to do.
I've seen a few discussions but no solutions. It's hard to find the right keyword search combinations to cut through all of the other topics addressed by the same keywords.