Does anyone know of a way of getting SAS to read in files delimited by more than one character? Some files I have to import have over a million records in them, containing customer details entered by call-centre staff over a period of several years. I can't find one single character that will successfully delimit all the records! I figure it will be harder to find a string entered in a record that contains '^$*' than just '^' or '$' or '*'.
I create the files overnight, typically taking the form of:
NAME^AGE^ etc... ('^' being the current delimiter)
Any ideas? I'm using 9.1
You can list multiple delimiters in the DLM= option of the INFILE statement. Delimiters placed next to each other would be treated as one delimiter. Records with only one of the delimiters would be read successfully.
infile datalines dlm='^$';
input name $ age city $ state $;
proc print data=readmult;
title 'Use Multiple Delimiters';
I don't think this is really what is asked for. I interpret the post as the delimiter always must all of the specified characters in that sequence, just to avoid that some character fields may contain any single of the specified characters.
I don't think that you can do this within the INFILE statement (it would be a great feature though). So I think that you'll have to scan through the whole record searching for the delimiter, maybe using do-loops, FIND function and SUBSTR...
Just about all import routines (may be not oracle, but I'll be happy to be corrected) have a solution to the problem of finding a delimiter character within data values. To protect these (I call delimiters embedded in data) they all support the principle of "Quoting."
The easiest example would be an amount value formatted with the comma. format.
SAS supports the common solution used by (presumeably) all - Quote any value that contains an embedded delimiter.
Ok you may challenge, "what about the quote character turning up in a data value?"
The solution to this seems as standard as the principle of "quoting" - Protect quotes by quoting the string and repeating internal or unbalanced quotes. [pre] Mary said, "let's finish now."[/pre]would be converted to a protected form as [pre] "Mary said, ""let's finish now""."[/pre]
With these methods, text data transfers are achieved in a CSV style that is protected and reliable.
SAS supports this with the DSD option on file statements for creating CSV style, and on the INFILE statement for reading such demimited data.
On no occasion is a more complex delimiter (like a multi-character delimiter word) required.
I tried the 3rd option as it seemed most like what I was after:
Creating the files with the fields within double quotes (""^""^ etc) worked beautifully - my files imported with no errors or missing values! (1st time in almost a year!)
Thanks to all who replied.