
Data software such as Excel and Redshift creates delimited files in some common variations that SAS does not currently support directly.  Of these, there are two main ones that I think should be easy to implement and would solve the most common problems.

 

Support for reading and writing files with embedded end-of-line characters

This is common in CSV files generated by Excel.  Currently SAS can only handle such files when the embedded line break characters do not exactly match the end-of-line characters.  So SAS can read a file that uses CR+LF as the end of line and has embedded CR or LF characters, but not one whose records contain embedded CR+LF combinations.  Yet when the field is properly quoted, this is a format that Excel (and other software) does support.
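To make the problem concrete, here is a small sketch of the kind of file in question. It uses Python's standard `csv` module purely as a language-neutral illustration, not as a suggestion for how SAS should implement it. The sample data and the variable names are made up for this example.

```python
import csv
import io

# Made-up sample: a header plus two data records. The quoted field in the
# first data record contains an embedded CR+LF, the same sequence that
# terminates each record.
raw = 'id,comment\r\n1,"line one\r\nline two"\r\n2,plain\r\n'

# A quote-aware parser keeps the embedded CR+LF inside the field.
# newline="" prevents any line-ending translation before parsing.
rows = list(csv.reader(io.StringIO(raw, newline="")))

# A purely line-oriented reader splits on every CR+LF instead,
# so it sees one physical line more than there are records.
physical_lines = raw.split("\r\n")[:-1]

print(len(rows), len(physical_lines))
```

The quote-aware parser returns three records, while splitting on CR+LF alone yields four physical lines, which is the mismatch described above.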

 

This one is especially annoying since it is difficult to program around this limitation.  Currently, to read files like this, users need to pre-process the whole file to remove or replace the embedded line break characters.  And when writing files, SAS does not currently add quotes automatically around field values that contain the end-of-line characters.  So users might need to use the ~ modifier on PUT statements to force SAS to add quotes, and end up with files that have more quotes than are really needed.
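For anyone unfamiliar with these workarounds, the sketch below shows both ideas in Python (again only as an illustration, not SAS code). The function `flatten_embedded_breaks` is a hypothetical helper written for this example: it tracks whether the current character is inside quotes and replaces line breaks found there. It assumes double-quote quoting with doubled quotes inside values. The second part shows what "only as many quotes as needed" looks like, using the default minimal quoting of Python's `csv.writer`.

```python
import csv
import io


def flatten_embedded_breaks(text, quote='"', replacement=" "):
    """Replace CR, LF, or CR+LF found inside quoted fields (hypothetical helper)."""
    out = []
    in_quotes = False
    prev = ""
    for ch in text:
        if ch == quote:
            # A doubled quote toggles the state twice, so it nets out.
            in_quotes = not in_quotes
            out.append(ch)
        elif in_quotes and ch == "\n" and prev == "\r":
            pass  # LF of a CR+LF pair whose CR was already replaced
        elif in_quotes and ch in "\r\n":
            out.append(replacement)
        else:
            out.append(ch)
        prev = ch
    return "".join(out)


# Reading side: after pre-processing, every CR+LF left is a real end of record.
cleaned = flatten_embedded_breaks('1,"a\r\nb"\r\n2,"say ""hi""\nbye"\r\n')

# Writing side: minimal quoting only quotes the value that needs it.
buf = io.StringIO(newline="")
csv.writer(buf).writerow(["1", "plain", "two\r\nlines"])
written = buf.getvalue()

print(repr(cleaned))
print(repr(written))
```

Note that the pre-processing step loses information, since the original line breaks are replaced, which is part of why native support would be preferable.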

 

Support for the use of an escape character

To protect special characters like delimiters, CR, LF, quotes, and the escape character itself, a lot of databases (and programming languages) use an escape character, typically a backslash.  This method of protecting special characters in text is popular in many languages, for example Unix command shells, and has been adopted by major database platforms like Redshift.  Typically this is done instead of adding quotes around values that contain special characters.  Some tools do add quotes around the values, but instead of doubling the quotes that appear in the data they prefix them with the backslash character.

SAS should be able to read a delimited file that is using an escape character instead of quoting to protect embedded special characters. And it should be able to write files that use escaping instead of quoting to protect special characters.
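As a rough illustration of the escaping convention, here is a minimal Python sketch that writes and reads backslash-escaped records without any quoting. The function names (`escape_field`, `write_record`, `parse_records`) are hypothetical, and the set of escaped characters (delimiter, backslash, CR, LF, double quote) is an assumption for this example. Real tools differ in exactly which characters they escape.

```python
def escape_field(value, delimiter=",", escape="\\"):
    """Prefix each special character in a value with the escape character."""
    special = {delimiter, escape, "\r", "\n", '"'}  # assumed set, see lead-in
    return "".join(escape + ch if ch in special else ch for ch in value)


def write_record(fields, delimiter=",", escape="\\"):
    """Join escaped fields into one record (without the record terminator)."""
    return delimiter.join(escape_field(f, delimiter, escape) for f in fields)


def parse_records(text, delimiter=",", escape="\\"):
    """Split escaped text into records, assuming LF terminates each record."""
    records, fields, current = [], [], []
    chars = iter(text)
    for ch in chars:
        if ch == escape:
            # Take the next character literally, whatever it is.
            current.append(next(chars, ""))
        elif ch == delimiter:
            fields.append("".join(current))
            current = []
        elif ch == "\n":
            fields.append("".join(current))
            records.append(fields)
            fields, current = [], []
        else:
            current.append(ch)
    if current or fields:  # final record without a trailing LF
        fields.append("".join(current))
        records.append(fields)
    return records


original = ["a,b", 'say "hi"', "x\\y", "two\nlines"]
line = write_record(original)
round_trip = parse_records(line + "\n")

print(line)
print(round_trip == [original])
```

Because every special character is escaped individually, an unescaped delimiter or line feed is always structural, so the reader never needs to track quote state.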

11 Comments
ChrisNZ
Tourmaline | Level 20

Very useful initiatives Tom!